***: Guest6009 is now known as ndufresne <br> eelstrebor has quit IRC (Quit: Ex-Chat) <br> camus has quit IRC (Read error: Connection reset by peer) <br> camus has joined #linux-media <br> b-rad has quit IRC (Ping timeout: 480 seconds) <br> b-rad has joined #linux-media <br> NiksDev has joined #linux-media <br> jm_h has joined #linux-media <br> GBenji has joined #linux-media <br> ao2 has joined #linux-media <br> Svenska has left <br> gouchi has joined #linux-media
pinchartl: I've just realized that calling VIDIOC_EXPBUF on the same buffer twice produces two different dmabuf instances :-S <br> is there a reason for that, or could it be fixed ?
kbingham: pinchartl, Does two dmabuf instances mean two separate inodes? (I assume so) <br> So we can't identify them as being the same source...
pinchartl: yes it does <br> exactly :-)
hverkuil: <u>pinchartl</u>: good question. I don't think anyone thought about that. <br> My gut feeling says that this is a bug and should be addressed (i.e. return the same fd with incremented refcount) <br> (well, perhaps not quite a bug, but certainly unexpected)
pinchartl: I share the same feeling
***: mriesch has quit IRC (Remote host closed the connection) <br> mriesch has joined #linux-media <br> djrscally has joined #linux-media <br> epoll has quit IRC (Ping timeout: 480 seconds) <br> epoll has joined #linux-media
dv_: for my understanding: stateless mem2mem decoders only get the bare minimum data, that is, slices etc., no actual bitstreams, correct?
<br> I ask because the docs say that these decoders still work on a per-frame basis, that is, they need the entire info for a frame
ndufresne: Only the slices, there is a control that tells us if we pass the slice with or without a startcode (that is HW specific) <br> and there is also a control to check if they work per frame or per slice
dv_: but isn't a "slice" a rather h264/h265 specific concept?
ndufresne: all per-frame decoders require a startcode to figure out where the slices start
dv_: or do vp8 and such also have slices?
ndufresne: indeed, VP8/VP9 are frame only <br> the controls are per codec
dv_: k
ndufresne: for VP8/VP9 all HW comes from Hantro/Google, even though their programming interfaces vary, so the workflow is pretty much always identical <br> for mpeg2, all decoders are frame based
dv_: av1?
***: camus has quit IRC ()
ndufresne: for now, av1 is frame based, but requires per tile info, we are adding the concept of dynamic array control <br> this will also be used in HEVC for rkvdec and RPi <br> but perhaps this is too much detail ?
dv_: hm nah it's fine
ndufresne: in av1, there is this concept of enhancement layer, the workflow isn't defined, so we aren't sure yet how to support that <br> it's used in the AV1 image format <br> AV1F, or AVIF, not sure ..
dv_: like ROI?
ndufresne: no, more like progressive loading <br> like having the base layer at 1080p, and enhancement layer to 4K as an example <br> or even smaller, aka thumbnail size for really fast indexing <br> but the layers are passed through the container, it's not signalled in bitstream <br> perhaps it will be similar to VP9, that we'll need the ability to allocate buffers in various resolution, and reference the previous decode like we do for reference frames <br> <u>hverkuil</u>: I'm hitting a hang with Cedrus driver, but the backtrace indicates some sort of deadlock, v4l2_m2m_cancel_job and v4l2_release are involved <br> If you have any hint/idea what cedrus could be doing wrong, it's not clear to me, here's the info I managed to get, https://gitlab.collabora.com/-/snippets/115
hverkuil: <u>ndufresne</u>: it looks like v4l2_m2m_cancel_job() stalls. It might be waiting on an interrupt or some other event, and that never happens. I don't think it is a deadlock, looking at the logs. <br> <u>ndufresne</u>: is it reproducible, or is it a random occurrence. <br> ?
ndufresne: <u>hverkuil</u>: I have a way to reproduce pretty much always <br> This only happens if I run concurrent decodes, if I run the tests one by one it passes, with really good score in fact <br> <u>hverkuil</u>: jernej said there is no code to "cancel" a trigger in cedrus, so indeed, perhaps an interrupt didn't happen, or got missed
hverkuil: I'll bet v4l2_m2m_cancel_job() is waiting forever for wait_event(m2m_ctx->finished, ...)
ndufresne: I think I would need some sort of trigger vs interrupt counter <br> <u>hverkuil</u>: jernej is worried since the HW parser is enabled, and it might be leaving some state that badly interacts when you interleave frames from different streams <br> it's all RE, so we don't know everything there is to know about it, imho, in the context, if there was some workaround it would be fine
hverkuil: I suggest checking first if it is indeed the wait_event that never ends.
If so, then check if _v4l2_m2m_job_finish() was actually called. For all I know it can be wrong logic there since we almost never test with concurrent decodes. I don't think I have even tried that with vicodec or vim2m.
ndufresne: ok, I guess for that I need to badger a few printk myself <br> I think the backtrace on hung process was useful though, I feel much closer than I was last week ;-D <br> <u>hverkuil</u>: thanks for the advice, this is work I do when I have spare time, so expect some delays, with that fix, I might have enough to start looking into automated CI <br> I got notes on how to run jobs into kernel CI lab, in gst, I'll use gitlab CI as a driver, what do we have in linux-media to drive kernel-ci jobs ?
***: camus has joined #linux-media <br> b-rad has quit IRC (Remote host closed the connection)
hverkuil: <u>ndufresne</u>: I don't really know what kernel-ci uses. We don't have anything for kernel-ci in our repos. I think they mostly run v4l2-compliance. <br> In my daily regression test I use the test-media script in contrib/test in v4l-utils. That tests all the media vi* drivers, and it does a pretty decent job.
broonie: Yeah, kernelci is running v4l2-compliance.
***: b-rad has joined #linux-media
hverkuil: <u>ndufresne</u>: you said that it is reproduced by running concurrent decoders. Have you tried that as well with other drivers besides cedrus? Basically is this a cedrus-specific issue or a more generic m2m framework issue? If it is the latter, then I definitely need to get involved to try and find a solution. <br> One thing that concerns me in the m2m framework is that it calls wait_event and not wait_event_interruptible. I'm not sure if that's right. <br> <u>ndufresne</u>: actually you might try replacing wait_event with wait_event_interruptible in v4l2-mem2mem.c: see if it is then possible to break off when it hangs. It might break somewhere else though, since there might be a good reason why wait_event is used.
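<br> The two checks hverkuil suggests above (does the wait on `finished` ever return, and was the job-finish path actually called) can be illustrated with a small userspace analogy. This is only a sketch and not kernel code: `JobContext`, `start_job`, `job_finish`, `cancel_job` and `demo_wait` are hypothetical names, Python's `threading.Condition` stands in for the kernel wait queue, and the bounded wait is there purely to make a lost "interrupt" observable instead of hanging.

```python
import threading


class JobContext:
    """Toy stand-in for an m2m context: a job flag plus a wait queue."""

    def __init__(self):
        self._cond = threading.Condition()
        self.job_running = False
        self.finish_calls = 0  # diagnostic counter: how often job_finish ran

    def start_job(self):
        with self._cond:
            self.job_running = True

    def job_finish(self):
        """Simulated interrupt handler path: mark the job done, wake waiters."""
        with self._cond:
            self.job_running = False
            self.finish_calls += 1
            self._cond.notify_all()

    def cancel_job(self, timeout=None):
        """Wait for the running job to finish.

        timeout=None blocks indefinitely (the unbounded-wait case);
        a number of seconds bounds the wait. Returns True if the job
        finished, False if the wait timed out.
        """
        with self._cond:
            return self._cond.wait_for(lambda: not self.job_running, timeout)


def demo_wait():
    # Case 1: the simulated interrupt arrives, so the wait completes.
    ok = JobContext()
    ok.start_job()
    threading.Timer(0.05, ok.job_finish).start()
    finished_ok = ok.cancel_job(timeout=2.0)

    # Case 2: the interrupt is lost, job_finish never runs, the wait times out.
    lost = JobContext()
    lost.start_job()
    finished_lost = lost.cancel_job(timeout=0.2)

    return finished_ok, ok.finish_calls, finished_lost, lost.finish_calls
```

In the second case the counter stays at zero, which is the userspace equivalent of confirming that the finish path was never reached rather than that the wake-up was missed.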
ndufresne: <u>hverkuil</u>: I also saw issues with other drivers, but never got to prove it was the same <br> for me, Hantro G1 is rather stable (despite the poor score) <br> rkvdec issues were mostly due to the overclocking in the driver, dropping that huge hack seems to fix this <br> so for now, I'd say it's only cedrus
ezequielg: <u>hverkuil</u>: in fact, i think i had that discussion with Robert Beckett (not here in irc) about using wait_event_interruptible. <br> but i think that could lead to use-after-free, couldn't it? <br> the core will release resources, but the driver might still be holding on to them. <br> <u>ndufresne</u>: fwiw, job_abort is optional, so drivers don't need to support that job_abort in order to be cancelled. the issue imo sounds like hw stalling. <br> now, if there's no support for concurrency, then cedrus should change that in its open(), and in the stream start or something to prevent two contexts from running.
jernej: <u>ezequielg</u>: Only VP8 is problematic and even here, it should be ok to interleave frames in theory
ezequielg: OK.
jernej: it certainly boils down to not completely understanding the inner workings
ezequielg: I mean, we can add some sort of wait_event_timeout, but it should have a very noisy warning :) <br> <u>jernej</u>: if the hw has some internal state, how can it interleave?
jernej: coefficients are stored in auxiliary buffers which are programmed in registers <br> and iirc I didn't use one update flag, so there is still hope to make it somehow work
***: eelstrebor has joined #linux-media <br> GBenji has left <br> b-rad has quit IRC (Ping timeout: 480 seconds) <br> zhxuxu_ has joined #linux-media <br> zhxuxu has quit IRC (Ping timeout: 480 seconds) <br> eelstrebor has quit IRC (Ping timeout: 480 seconds) <br> zhxuxu_ is now known as zhxuxu <br> zhxuxu has quit IRC (Quit: Leaving) <br> zhxuxu has joined #linux-media <br> jarthur has joined #linux-media <br> camus has quit IRC () <br> b-rad has joined #linux-media <br> eballetbo has quit IRC (Quit: Ping timeout (120 seconds)) <br> eballetbo has joined #linux-media <br> paulk1 has joined #linux-media <br> paulk has quit IRC (Ping timeout: 480 seconds) <br> ao2 has quit IRC (Quit: Leaving) <br> jm_h has quit IRC (Remote host closed the connection) <br> gouchi has quit IRC (Remote host closed the connection) <br> djrscally has quit IRC (Ping timeout: 480 seconds) <br> jarthur has quit IRC (Ping timeout: 480 seconds) <br> NiksDev has quit IRC (Ping timeout: 480 seconds) <br> eelstrebor has joined #linux-media <br> jarthur has joined #linux-media
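<br> Returning to the VIDIOC_EXPBUF question at the top of this log: kbingham's point was that if each export gets its own inode, userspace cannot tell that two fds refer to the same source. A minimal sketch of that kind of identity check, assuming that comparing `st_dev` and `st_ino` from `fstat` is the check userspace would rely on. It runs on temporary files rather than real dmabuf fds, and `same_inode` and `demo_inodes` are hypothetical helper names.

```python
import os
import tempfile


def same_inode(fd_a, fd_b):
    """Return True if both descriptors refer to the same (device, inode)."""
    stat_a, stat_b = os.fstat(fd_a), os.fstat(fd_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


def demo_inodes():
    with tempfile.TemporaryFile() as first, tempfile.TemporaryFile() as second:
        # A dup()'d descriptor shares the open file, hence the same inode.
        dup_fd = os.dup(first.fileno())
        try:
            shared = same_inode(first.fileno(), dup_fd)
            # Two independently created files have distinct inodes, which is
            # the situation described for two separate exports of one buffer.
            distinct = same_inode(first.fileno(), second.fileno())
        finally:
            os.close(dup_fd)
    return shared, distinct
```

With the behaviour hverkuil proposes (returning the same dmabuf with an incremented refcount), a second export would presumably look like the `dup()` case here rather than the two-files case.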