<!-- Some styling for better description lists --><style type='text/css'>dt { font-weight: bold;float: left;display:inline;margin-right: 1em} dd { display:block; margin-left: 2em}</style>

   hverkuil: paulk-leonov: if we require that certain controls must be present in a request, then we need a control flag to indicate that to userspace. Users need to be able to figure this out.
   <br> paulk-leonov: ping
   paulk-leonov: hverkuil, pong but about to leave for food
   hverkuil: OK, you can answer when you're back :-)
   <br> 1) can you test that last patch I mailed yesterday? If that solves the refcount_inc error then I can post reqv16 later today.
   <br> 2) I realized that v4l2-ctrls.c is missing an API to let drivers update a control in a request to give feedback to userspace. The only thing supported is that the driver can set the current 'HW' control and that will then be copied to the request when v4l2_ctrl_request_complete is called.
   <br> I'm not sure whether that is a bug or a feature :-)
   ttomov: <u>hverkuil</u>: compliance test logs for QComm CAMSS driver are 130KB and 50KB, any preference how to send th to the list - directly paste in email, email attachment, use any online paste service?
   hverkuil: <u>ttomov</u>: just email directly to me. Easiest is to use an email attachment.
   ttomov: <u>hverkuil</u>: no copy to linux-media?
   hverkuil: No, it's a bit too large for that. It's mainly for me anyway.
   ttomov: <u>hverkuil</u>: ok
   <br> I'll send a v2 of the patchset, because I've found a few bugs with the compliance test. And will send the test log to you.
   hverkuil: Why is it so large? Are you using the -v option?
   <br> Basically I would like to see two logs:
   <br> v4l2-compliance -M /dev/mediaX -v
   ttomov: no -v, but there are a lot of video nodes and subdev nodes and the test tests all of them (with -m option)
   hverkuil: This checks the media device only. I want -v to check whether the topology looks good, esp. since I am still improving the tests in that area.
   <br> And again with v4l2-compliance -m /dev/mediaX, this does this recursively. You don't want -v here since then the logs will become really long.
   ttomov: -M is small enough, I can add this in the cover-letter
   <br> and I can send -m to you directly
   hverkuil: Right.
   paulk-leonov: hverkuil, which patch is it for the refcount_inc error?
   <br> hverkuil, last patch I got from you yesterday was about ctrls, maybe I missed it?
   <br> hverkuil, 2) ah I hadn't thought about that
   <br> I guess I don't really have any strong opinion on that specific topic
   hverkuil: paulk-leonov: "New patch, hopefully fixes the refcount warnings"
   paulk-leonov: oh right
   <br> I did see it yesterday :)
   <br> hverkuil, it's running as we speak -- hasn't hit the crash so far :)
   <br> LGTM :)
   hverkuil: Ah, great.
   cristian_c: <u>hverkuil</u>: I've taken a look at www.linuxtv.org api and I've just found V4L2_BUF_FLAG_ERROR flag
   hverkuil: paulk-leonov: I'll post reqv16 in about an hour. Let me know if you get something before that.
   cristian_c: I've also tried the video grabber via the same github repository and I always get 'error' lines during video streaming
   <br> *also tried in archlinux
   paulk-leonov: hverkuil, sounds good!
   <br> hverkuil, will v16 also include "Patch for reqv15" and the associated fixup for scheduling the m2m run?
   hverkuil: yes
   <br> The only patch not there is "[PATCH] v4l2-ctrls: add v4l2_ctrl_request_hdl_find/put/ctrl_find functions"
   paulk-leonov: right, I didn't have time to get back to that
   hverkuil: I think that patch (assuming it works and does what it should do) should go in with your driver.
   paulk-leonov: sounds good to me
   <br> hverkuil, I'm hitting a strange issue here
   <br> hverkuil, it seems that buffer allocation gives a much bigger gap between the 2 nv12 planes addresses for the first capture buffer
   <br> here are the addresses: http://leonov.paulk.fr/collins/~paulk/paste/index.php?paste=0a703d6d20bca60db8cb15abebc6da07&amp;raw=true
   <br> and for some reason, our VPU produces garbage when that happens
   hverkuil: I can't imagine that that's related to the request API, am I right?
   paulk-leonov: probably not, indeed :)
   <br> hverkuil, I was wondering if that rings a bell to you, off-hand
   hverkuil: Anyway, I have no idea. I think how the memory is allocated depends on the fragmentation of memory. There isn't much you can do to influence that.
   tfiga: paulk-leonov: what is the address printed?
   <br> physical? IOVA?
   <br> which vb2 allocator does your driver use?
   paulk-leonov: that's physical with PHYS_OFFSET subtracted
   <br> allocator is dma contiguous
   <br> so CMA, if I'm not mistaken
   tfiga: no IOMMU attached?
   paulk-leonov: no IOMMU
   tfiga: is this NV12M or NV12?
   paulk-leonov: I use the NV12 pix fmt with 2 planes
   tfiga: uhm
   paulk-leonov: ahh
   tfiga: that would be NV12M I suppose
   <br> NV12 non-M has only 1 memory plane
   paulk-leonov: actually, we have the same issue with our custom format
   tfiga: basically, NV12M has both color planes separated into separate memory planes (=== buffers)
   paulk-leonov: and I don't see how the allocator is aware of the format
   tfiga: NV12 has both color planes stuffed into the same memory plane (=== buffer)
   paulk-leonov: it seems that the queue_setup qop only needs sizes and nplanes
   tfiga: yes
   paulk-leonov: so I'm probably abusing NV12, but I doubt it really matters here
   tfiga: generally if you have nplanes == 2, then you get 2 separate buffers
   <br> without any guarantees with location of one relatively to anohter
   <br> if your hardware requires both color planes of NV12 to be one after another, with current API, you can only use NV12
   <br> with 1 plane
   <br> I'm not sure how you're using it with 2 planes, though
   <br> couldn't get time to review your series yet :(
   paulk-leonov: honestly, it didn't really make any difference (1 or 2 planes) until now, and the hardware always takes distinct addresses for chroma and luma
   tfiga: paulk-leonov: ah, I see you have V4L2_PIX_FMT_MB32_NV12
   paulk-leonov: which makes it especially hard to understand what's wrong
   <br> tfiga, indeed
   tfiga: hmm
   <br> isn't it actually V4L2_PIX_FMT_MB32_NV16?
   <br> from the description, it looks like only width is subsampled
   paulk-leonov: tfiga, no, it's 2x2 sub-sampled
   tfiga: how comes that the tile is 64x32?
   paulk-leonov: tfiga, but plane size calculation is wrong in the latest version from the ml
   <br> tfiga, the tile is 32x32 for chroma and 64x16 for luma, actually
   tfiga: okay, that makes sense :)
   <br> thanks for clarifying
   paulk-leonov: or 32x64, rather
   <br> but yeah, you get the idea :)
   tfiga: hmm
   <br> not sure what gives you the garbage then
   <br> paulk-leonov: in v4, you don't seem to be setting alloc_devs[]
   <br> in queue_setup
   paulk-leonov: oh that's right
   <br> good catch tfiga!
   tfiga: paulk-leonov: I might be missing something, but from what I see, it would make vb2 dma contig pass NULL to dma_alloc*()
   <br> as dev
   <br> so probably your buffers wouldn't be allocated as needed for your device
   paulk-leonov: looks like lots of drivers are not setting alloc_devs
   <br> thanks for the help :)
   <br> I think this is really just an oddity of the hardware
   tfiga: hmm
   <br> then I might be missing some implicit override from NULL to some dev somewhere
   paulk-leonov: at least the allocation does seem to happen
   tfiga: yeah, technically if you pass NULL as dev to dma_alloc_coherent() it would still give you something
   <br> aha, found it
   <br> https://elixir.bootlin.com/linux/v4.18-rc3/source/drivers/media/common/videobuf2/videobuf2-core.c#L211
   <br> paulk-leonov: okay, out of ideas and stamina, so going to call it a day
   <br> good luck :)
   <br> I'll try to get some time to review your series
   paulk-leonov: tfiga, thanks!
   <br> I've conducted various tests and it seems that too big a gap between the planes causes the issues
   <br> issue*
   <br> I can't really make sense of that without knowing the VPU hw design
   <br> so I'll just take it as a constraint
   <br> and go with NV12M from now on
   <br> (I just hope it won't mess up dmabuf import)
   <br> (on the drm side)
   ndufresne: tfiga, paulk-leonov: if you use NV12, you need to be able to avoid arbitrary padding between planes though, as the offset between planes is not expose in V4L2, userspace will have to guess
   <br> long term, we should be able to have 1 allocation to support NV12M, there is no reason to prevent this
   paulk-leonov: ndufresne, that's when there is only one plane (NV12M), right?
   ndufresne: only 1 plane == NV12, 2 planes NV12M, and yes, that's when there is only 1 plane
   paulk-leonov: ahh I was mixing them up
   <br> so what I need here is NV12 after all
   ndufresne: so you could not do something weird like, [plane Y] [random padding] [plane UV]
   <br> paulk-leonov, NV12 seems like the way forward
   paulk-leonov: ok I see
   ndufresne: it does not matter later, since when you prime, you can pass the same dmabuf FD with different offset, same when you import in GL and similar
   paulk-leonov: nice, I was about to ask about that next
   <br> so I can do one export on the v4l2 side and 2 imports on DRM
   <br> but then I need my userspace to guess what offset to apply
   ndufresne: 2 import on DRM would be EGL path ?
   <br> or you mean pass two planes on DRM ? (if second, yes)
   paulk-leonov: native DRM or EGL, I guess
   <br> yes, pass two planes
   <br> since NV12 on DRM is bi-planar
   <br> is there no way I can get the offset to chroma when using a single v4l2 plane?
   <br> it kinda sucks to have userspace calculate it
   ndufresne: you need userspace to calculate it
   <br> bi-planar is a Chrome OS team term btw, the right term is semi-planar
   <br> for semi planar, there is only 1 offset, which is bytesperline * height
   <br> where height is padded height reported by S_FMT
   <br> (un contrast with display height, reported by G_SELECTION)
   <br> for planar formats, like YUV420, the offsets depends on the subsampling
   paulk-leonov: so S_FMT is supposed to align height?
   ndufresne: yes, S_FMT speaks in term of padded width/height, aka allocation size
   paulk-leonov: neat
   ndufresne: so basically you give S_FMT the display width/height and it will pad it with appropriate HW constraints
   paulk-leonov: sofar we're abstracting all the height alignment in the driver
   <br> does S_FMT also report modified values to userspace or does userspace need to issue G_FMT?
   ndufresne: I bet 1080 -&gt; 1088, like 99% of the decoders (inclusdng SW decoders)
   <br> paulk-leonov, it must update the structure before it returns yes
   paulk-leonov: but that's only for height, I guess width must remain untouched and bytesperline holds the aligned width
   ndufresne: only applies to RAW formats, if it's H264 as an example, the width/height will be left as-is, it does not matters on decoders in fact, since it's inside the bitstream
   paulk-leonov: is that right?
   <br> I see
   <br> note that on stateless VPUs we don't get it from the bitstream ;)
   ndufresne: right, for width, it's generally left untouched, but sometimes it's being padded too, both ways are fine and work
   paulk-leonov: I actually removed it from the controls in mpeg2 for the next version, since it's redundant
   ndufresne: I think scalers and color converters will generally pad both for simplicity
   paulk-leonov: ok
   <br> thanks for these details :)
   hverkuil: <u>tfiga</u>: gnurou: paulk-leonov: posted v16 of the Request API.
   ***: benjiG has left
   hverkuil: paulk-leonov: quick question: is the HW your testing the cedrus driver on a single CPU core system?
   <br> <u>ttomov</u>: I'm looking at the warnings in the v4l2-compliance output: I think the crop code in ov5645.c is bogus. It doesn't seem to do anything.
   <br> Removing the crop code in that driver will likely fix those remaining warnings.
   larsc: removing all the code will definitely fix the remaining warnings
   paulk-leonov: hverkuil, no actually we have always tested on SMP multi-cores
   <br> either 2 (A20) or 4 (A33, H3)
   <br> ah nevermind, the A13 (also tested) is 1 core
   ***: jmondi has quit IRC (Ping timeout: 255 seconds)