
This article is reproduced from the WeChat official account 「运维开发小故事」 (author: 运维开发小故事). Please contact the 运维开发小故事 WeChat account before reposting.
1. Introduction

The whole troubleshooting process and the relevant source code reading were done and recorded by a developer friend of mine; with his permission, I am publishing them here.

2. Problem

One day we received feedback from a customer: the pod's events contained a large number of warning events — Readiness probe failed: OCI runtime exec failed: exec failed: EOF: unknown — although access to the service itself was not affected.

3. Environment

Special note: the customer had the kubelet cpu-manager enabled on the k8s nodes running the business workloads.
Component | Version
k8s       | 1.14.x
4. Troubleshooting

1. After receiving the customer's report, we checked the kubelet log on the node where the pod was running:
I0507 03:43:28.310630   57003 prober.go:112] Readiness probe for "adsfadofadfabdfhaodsfa(d1aab5f0-ae8f-11eb-a151-080027049c65):c0" failed (failure): OCI runtime exec failed: exec failed: EOF: unknown
I0507 07:08:49.834093   57003 prober.go:112] Readiness probe for "adsfadofadfabdfhaodsfa(a89a158e-ae8f-11eb-a151-080027049c65):c0" failed (failure): OCI runtime exec failed: exec failed: unexpected EOF: unknown
I0507 10:06:58.307881   57003 prober.go:112] Readiness probe for "adsfadofadfabdfhaodsfa(d1aab5f0-ae8f-11eb-a151-080027049c65):c0" failed (failure): OCI runtime exec failed: exec failed: EOF: unknown
The error type of the probe is Failure; the corresponding kubelet code (shown only as a screenshot in the original post) is the exec prober — a paraphrased sketch is given under step 3 below.

2. Next, check the Docker logs:

time="2021-05-06T16:51:40.009989451 08:00"level=errormsg="streamcopyerror:readingfromaclosedfifo"time="2021-05-06T16:51:40.010054596 08:00"level=errormsg="streamcopyerror:readingfromaclosedfifo"time="2021-05-06T16:51:40.170676532 08:00"level=errormsg="Errorrunningexec8e34e8b910694abe95a467b2936b37635fdabd2f7b7c464dfef952fa5732aa4eincontainer:OCIruntimeexecfailed:execfailed:EOF:unknown"
Although the Docker log reports a stream copy error, it is actually the underlying runc that returned EOF, which produced the error.

3. Since the log shows the probe type as Failure, e.CombinedOutput() returned err != nil with a non-zero ExitStatus, and data contained "OCI runtime exec failed: exec failed: unexpected EOF: unknown"; this path ultimately calls the RunInContainer method.
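For reference, here is a paraphrased sketch of that logic (based on pkg/probe/exec/exec.go and pkg/kubelet/kuberuntime/kuberuntime_container.go around k8s 1.14; not verbatim, details may differ between versions). A non-zero exit status is reported as Failure, and the probe message is the combined output of the exec, which is why the OCI runtime error text shows up in the event:

func (pr execProber) Probe(e exec.Cmd) (probe.Result, string, error) {
	// CombinedOutput runs the probe command in the container; "data" carries
	// stdout+stderr, which is where the "OCI runtime exec failed: ..." text comes from.
	data, err := e.CombinedOutput()
	if err != nil {
		exit, ok := err.(exec.ExitError)
		if ok {
			if exit.ExitStatus() == 0 {
				return probe.Success, string(data), nil
			}
			// non-zero exit status -> the probe result is Failure
			return probe.Failure, string(data), nil
		}
		return probe.Unknown, "", err
	}
	return probe.Success, string(data), nil
}

e.CombinedOutput() goes through RunInContainer, which simply forwards the command to the CRI ExecSync call:

func (m *kubeGenericRuntimeManager) RunInContainer(id kubecontainer.ContainerID, cmd []string, timeout time.Duration) ([]byte, error) {
	stdout, stderr, err := m.runtimeService.ExecSync(id.ID, cmd, timeout)
	// stdout+stderr become the probe output ("data" above)
	return append(stdout, stderr...), err
}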

ExecSync calls dockershim's ExecSync over gRPC.
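On the kubelet side that gRPC call goes through the CRI client; a paraphrased sketch (based on the RemoteRuntimeService in pkg/kubelet/remote, k8s ~1.14; not verbatim) shows how a non-zero exit code is converted into an exec.ExitError, which is exactly what the prober above type-asserts on:

func (r *RemoteRuntimeService) ExecSync(containerID string, cmd []string, timeout time.Duration) ([]byte, []byte, error) {
	ctx, cancel := getContextWithTimeout(r.timeout)
	defer cancel()

	resp, err := r.runtimeClient.ExecSync(ctx, &runtimeapi.ExecSyncRequest{
		ContainerId: containerID,
		Cmd:         cmd,
		Timeout:     int64(timeout.Seconds()),
	})
	if err != nil {
		return nil, nil, err
	}

	err = nil
	if resp.ExitCode != 0 {
		// the runtime's stderr (containing the OCI runtime error text) is
		// embedded in the error and also returned as part of the probe output
		err = utilexec.CodeExitError{
			Err:  fmt.Errorf("command '%s' exited with %d: %s", strings.Join(cmd, " "), resp.ExitCode, resp.Stderr),
			Code: int(resp.ExitCode),
		}
	}
	return resp.Stdout, resp.Stderr, err
}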

dockershim eventually calls the ExecInContainer method, and that method returned an error with a non-zero exit code.
func (*NativeExecHandler) ExecInContainer(client libdocker.Interface, container *dockertypes.ContainerJSON, cmd []string, stdin io.Reader, stdout, stderr io.WriteCloser, tty bool, resize <-chan remotecommand.TerminalSize, timeout time.Duration) error {
	execObj, err := client.CreateExec(container.ID, createOpts)

	startOpts := dockertypes.ExecStartCheck{Detach: false, Tty: tty}
	streamOpts := libdocker.StreamOptions{
		InputStream:  stdin,
		OutputStream: stdout,
		ErrorStream:  stderr,
		RawTerminal:  tty,
		ExecStarted:  execStarted,
	}
	err = client.StartExec(execObj.ID, startOpts, streamOpts)
	if err != nil {
		return err
	}

	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	count := 0
	for {
		inspect, err2 := client.InspectExec(execObj.ID)
		if err2 != nil {
			return err2
		}
		if !inspect.Running {
			if inspect.ExitCode != 0 {
				err = &dockerExitError{inspect}
			}
			break
		}
		count++
		if count == 5 {
			klog.Errorf("Exec session %s in container %s terminated but process still running!", execObj.ID, container.ID)
			break
		}
		<-ticker.C
	}
	return err
}
ExecInContainer does the following:
Calls CreateExec to create the exec ID.
Calls StartExec to run the exec, redirecting I/O through holdHijackedConnection: the inputStream is written into the connection, and the response stream is redirected to stdout and stderr.
Calls InspectExec to obtain the running state and exit code of this exec.
So the error printed in the log is the character stream delivered over the response stream. In other words, dockerd's response contained the error text.

Now look for the cause in the docker code: ExecStart ends up in the dockerd code below.
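The screenshot from the original post is not reproduced here. As a rough, paraphrased fragment (based on moby's daemon/exec.go; names and error wrapping are approximate and differ across Docker versions), ContainerExecStart hands the exec off to the libcontainerd client and returns whatever error comes back, which is then written onto the exec's response stream:

// inside (d *Daemon) ContainerExecStart, paraphrased and abridged
systemPid, err := d.containerd.Exec(ctx, c.ID, ec.ID, p, cStdin != nil, ec.InitializeStdio)
if err != nil {
	// the runc error ("exec failed: EOF: unknown") bubbles up here;
	// the real code wraps/translates it before returning
	return err
}
ec.Pid = systemPid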

From the docker log above, the error message in err is: OCI runtime exec failed: exec failed: EOF: unknown. In other words, ContainerExecStart returned an error. ContainerExecStart calls containerd.Exec, which is where dockerd talks to containerd.

// docker/libcontainerd/client_daemon.go
// Exec creates exec process.
//
// The containerd client calls Exec to register the exec config in the shim side.
// When the client calls Start, the shim will create stdin fifo if needs. But
// for the container main process, the stdin fifo will be created in Create not
// the Start call. stdinCloseSync channel should be closed after Start exec
// process.
func (c *client) Exec(ctx context.Context, containerID, processID string, spec *specs.Process, withStdin bool, attachStdio StdioCallback) (int, error) {
	ctr := c.getContainer(containerID)
	if ctr == nil {
		return -1, errors.WithStack(newNotFoundError("no such container"))
	}
	t := ctr.getTask()
	if t == nil {
		return -1, errors.WithStack(newInvalidParameterError("container is not running"))
	}

	if p := ctr.getProcess(processID); p != nil {
		return -1, errors.WithStack(newConflictError("id already in use"))
	}

	var (
		p              containerd.Process
		rio            cio.IO
		err            error
		stdinCloseSync = make(chan struct{})
	)

	fifos := newFIFOSet(ctr.bundleDir, processID, withStdin, spec.Terminal)

	defer func() {
		if err != nil {
			if rio != nil {
				rio.Cancel()
				rio.Close()
			}
		}
	}()

	p, err = t.Exec(ctx, processID, spec, func(id string) (cio.IO, error) {
		rio, err = c.createIO(fifos, containerID, processID, stdinCloseSync, attachStdio)
		return rio, err
	})
	if err != nil {
		close(stdinCloseSync)
		return -1, wrapError(err)
	}

	ctr.addProcess(processID, p)

	// Signal c.createIO that it can call CloseIO
	//
	// the stdin of exec process will be created after p.Start in containerd
	defer close(stdinCloseSync)

	if err = p.Start(ctx); err != nil {
		// use new context for cleanup because old one may be cancelled by user, but leave a timeout to make sure
		// we are not waiting forever if containerd is unresponsive or to work around fifo cancelling issues in
		// older containerd-shim
		ctx, cancel := context.WithTimeout(context.Background(), 45*time.Second)
		defer cancel()
		p.Delete(ctx)
		ctr.deleteProcess(processID)
		return -1, wrapError(err)
	}

	return int(p.Pid()), nil
}
A FIFOSet is created here, and "reading from a closed fifo" only happens when the fifo has already been closed while it is still being read, i.e. f.Close() happens before f.Read(). In the outer code we can see:
	defer func() {
		if err != nil {
			if rio != nil {
				rio.Cancel()
				rio.Close() // this Close causes the fifo to be closed
			}
		}
	}()

	p, err = t.Exec(ctx, processID, spec, func(id string) (cio.IO, error) {
		rio, err = c.createIO(fifos, containerID, processID, stdinCloseSync, attachStdio)
		return rio, err
	})
	if err != nil {
		close(stdinCloseSync)
		return -1, wrapError(err)
	}

	ctr.addProcess(processID, p)

	// Signal c.createIO that it can call CloseIO
	//
	// the stdin of exec process will be created after p.Start in containerd
	defer close(stdinCloseSync)

	// if p.Start fails, the internal fifos are closed, which leads to the
	// "reading from a closed fifo" problem
	if err = p.Start(ctx); err != nil {
		// use new context for cleanup because old one may be cancelled by user, but leave a timeout to make sure
		// we are not waiting forever if containerd is unresponsive or to work around fifo cancelling issues in
		// older containerd-shim
		ctx, cancel := context.WithTimeout(context.Background(), 45*time.Second)
		defer cancel()
		p.Delete(ctx)
		ctr.deleteProcess(processID)
		return -1, wrapError(err)
	}
p.Start calls into the following code, which talks to containerd over gRPC.
// github.com/containerd/containerd/task.go
func (t *task) Start(ctx context.Context) error {
	r, err := t.client.TaskService().Start(ctx, &tasks.StartRequest{
		ContainerID: t.id,
	})
	if err != nil {
		t.io.Cancel()
		t.io.Close()
		return errdefs.FromGRPC(err)
	}
	t.pid = r.Pid
	return nil
}
This gRPC call lands in the following containerd code:
func (e *execProcess) start(ctx context.Context) (err error) {
	var (
		socket  *runc.Socket
		pidfile = filepath.Join(e.path, fmt.Sprintf("%s.pid", e.id))
	)
	if e.stdio.Terminal {
		if socket, err = runc.NewTempConsoleSocket(); err != nil {
			return errors.Wrap(err, "failed to create runc console socket")
		}
		defer socket.Close()
	} else if e.stdio.IsNull() {
		if e.io, err = runc.NewNullIO(); err != nil {
			return errors.Wrap(err, "creating new NULL IO")
		}
	} else {
		if e.io, err = runc.NewPipeIO(e.parent.IoUID, e.parent.IoGID, withConditionalIO(e.stdio)); err != nil {
			return errors.Wrap(err, "failed to create runc io pipes")
		}
	}
	opts := &runc.ExecOpts{
		PidFile: pidfile,
		IO:      e.io,
		Detach:  true,
	}
	if socket != nil {
		opts.ConsoleSocket = socket
	}
	// err returned "exec failed: EOF: unknown"
	// the runtime here is the invocation of the runc binary
	if err := e.parent.runtime.Exec(ctx, e.parent.id, e.spec, opts); err != nil {
		close(e.waitBlock)
		return e.parent.runtimeError(err, "OCI runtime exec failed")
	}
The source of Exec is as follows:
// Exec executres and additional process inside the container based on a full
// OCI Process specification
func (r *Runc) Exec(context context.Context, id string, spec specs.Process, opts *ExecOpts) error {
	f, err := ioutil.TempFile(os.Getenv("XDG_RUNTIME_DIR"), "runc-process")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	err = json.NewEncoder(f).Encode(spec)
	f.Close()
	if err != nil {
		return err
	}
	args := []string{"exec", "--process", f.Name()}
	if opts != nil {
		oargs, err := opts.args()
		if err != nil {
			return err
		}
		args = append(args, oargs...)
	}
	cmd := r.command(context, append(args, id)...)
	if opts != nil && opts.IO != nil {
		opts.Set(cmd)
	}
	if cmd.Stdout == nil && cmd.Stderr == nil {
		data, err := cmdOutput(cmd, true)
		if err != nil {
			return fmt.Errorf("%s: %s", err, data)
		}
		return nil
	}
	ec, err := Monitor.Start(cmd)
	if err != nil {
		return err
	}
	if opts != nil && opts.IO != nil {
		if c, ok := opts.IO.(StartCloser); ok {
			if err := c.CloseAfterStart(); err != nil {
				return err
			}
		}
	}
	status, err := Monitor.Wait(cmd, ec)
	if err == nil && status != 0 {
		err = fmt.Errorf("%s did not terminate sucessfully", cmd.Args[0])
	}
	return err
}
So it is runc itself that emitted the error "exec failed: EOF: unknown" when it ran.

Running the runc exec command in a loop reproduces the problem occasionally. Further investigation found that during startup runc exec reads the container's state.json, and the json decode fails with an error.
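For context, runc loads the container state roughly like this (a paraphrased sketch of libcontainer's loadState; not verbatim, error wrapping simplified). If state.json is being rewritten at the same moment, the decoder sees a truncated document:

func (l *LinuxFactory) loadState(root string) (*State, error) {
	f, err := os.Open(filepath.Join(root, stateFilename))
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var state *State
	// A partially written state.json makes Decode return "EOF" or
	// "unexpected EOF", which runc then reports as "exec failed: EOF: unknown".
	if err := json.NewDecoder(f).Decode(&state); err != nil {
		return nil, err
	}
	return state, nil
}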

This is when we recalled that with the kubelet cpu-manager enabled, the kubelet periodically updates containers, which rewrites this state.json file. As a result, runc can read a file that cpu-manager has only partially rewritten, and the json decode fails. Comparing the timestamps of the runc EOF errors with the kubelet cpu-manager container updates (every 10s by default), the times lined up exactly, confirming the guess.
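The race is easy to demonstrate outside of runc and the kubelet. Below is a minimal, self-contained Go sketch (the file name and data are hypothetical and unrelated to the real code): one goroutine rewrites a JSON file in place with os.Create, the way the pre-fix saveState does, while the main loop keeps decoding it; every so often the reader opens a truncated file and gets EOF or unexpected EOF:

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type state struct {
	ID      string `json:"id"`
	Payload []int  `json:"payload"`
}

func main() {
	const path = "state.json" // hypothetical file, stands in for runc's state.json
	big := state{ID: "demo", Payload: make([]int, 100000)}

	// Writer: truncate-and-rewrite in place, like the pre-fix saveState
	// (os.Create followed by writing the JSON).
	go func() {
		for {
			f, err := os.Create(path)
			if err != nil {
				continue
			}
			json.NewEncoder(f).Encode(&big)
			f.Close()
		}
	}()

	// Reader: decode the file in a loop, like runc exec loading state.json.
	for i := 0; i < 1000000; i++ {
		f, err := os.Open(path)
		if err != nil {
			continue
		}
		var s state
		if err := json.NewDecoder(f).Decode(&s); err != nil {
			// typically prints "EOF" or "unexpected EOF" from time to time
			fmt.Printf("iteration %d: decode failed: %v\n", i, err)
		}
		f.Close()
	}
}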
Checking whether runc already has a fix, we found this PR: https://github.com/opencontainers/runc/pull/2467. The idea of the fix is to make saveState atomic (write to a temp file, then rename it into place), so a reader can no longer observe a partially written state.json and hit the unexpected EOF (or EOF) problem.
// before the fix
func (c *linuxContainer) saveState(s *State) error {
	f, err := os.Create(filepath.Join(c.root, stateFilename))
	if err != nil {
		return err
	}
	defer f.Close()
	return utils.WriteJSON(f, s)
}

// after the fix
func (c *linuxContainer) saveState(s *State) (retErr error) {
	tmpFile, err := ioutil.TempFile(c.root, "state-")
	if err != nil {
		return err
	}
	defer func() {
		if retErr != nil {
			tmpFile.Close()
			os.Remove(tmpFile.Name())
		}
	}()

	err = utils.WriteJSON(tmpFile, s)
	if err != nil {
		return err
	}
	err = tmpFile.Close()
	if err != nil {
		return err
	}

	stateFilePath := filepath.Join(c.root, stateFilename)
	return os.Rename(tmpFile.Name(), stateFilePath)
}
5. Resolution

Disable the cpu-manager, or
Upgrade runc to a version that includes the fix above.