broonie: quick question about the regulator API. the current implementation of regulator_bulk_{enable,disable,free} are effectively no-op if num_consumers is 0, and they won't dereference consumers in that case (which can thus safely be NULL). is that an API design guarantee that drivers can rely on ?
sailus: hi, I'm wondering: what's the right way to grab the mbus_code of a subdev from a camera controller perspective
sun6i-csi does this in link_validate because it's called just before streamon, but this looks like abusing link_validate
my bet would be calling v4l2_subdev_call get_fmt at streamon time, would that be better?
ribalda: paulk-leonov: The user is expected to set it.
sailus: on the camera controller side, there's no subdev so no mbus_format to set
just a v4l2 device with a media entity
Why not?
however, there's a pixfmt that is set with s_fmt_vid_cap
sailus: heh I couldn't say why, I didn't write the driver, but I also don't see why we need a subdev for the camera controller since we already have a media entity via the v4l2 device
so we can link to actual subdevs
paulk-leonov: the links between subdevs can be validated more or less automatically. for the links between subdevs and video nodes, validation is manual, and usually performed in either the streamon handler, or the vb2 start_streaming handler
there are quite a few examples in-tree
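[editor's note] The pattern pinchartl describes can be sketched as follows. This is a hypothetical, non-compilable kernel-style fragment, not code from any in-tree driver: names like `my_priv`, `remote_sd` and `remote_pad` are made up for illustration.

```c
/* Hypothetical sketch: grab the connected subdev's active format from
 * the vb2 start_streaming handler instead of (ab)using link_validate. */
static int my_start_streaming(struct vb2_queue *vq, unsigned int count)
{
	struct my_priv *priv = vb2_get_drv_priv(vq);
	struct v4l2_subdev_format fmt = {
		.which = V4L2_SUBDEV_FORMAT_ACTIVE,
		.pad = priv->remote_pad,
	};
	int ret;

	ret = v4l2_subdev_call(priv->remote_sd, pad, get_fmt, NULL, &fmt);
	if (ret)
		return ret;

	/* mbus code as configured at stream start time */
	priv->mbus_code = fmt.format.code;
	return 0;
}
```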
pinchartl: yes that's correct
and sun6i-csi does implement a custom link_validate
I'm just surprised it uses this call to grab mbus_format
because it seems out of scope of link validation
Is sun6i-csi driver really implemented without a sub-device?
sailus: yes definitely
sun4i-csi has one, but I think it would be cleaner to get rid of it
since it doesn't describe anything that actually exists
Isn't there a CSI-2 receiver?
Or a parallel one, at least?
there's a parallel one, yes
so that would be a legit subdev to include?
It's there, isn't it?
right right, so I was wrong, it does describe something that exists :)
but I guess our topology's wrong though, because we expose a single port
If that sub-device isn't there, the user will have hard time to figure out which media bus codes are actually supported there.
yeah I think that currently, it just takes any mbus_code in
from the remote subdev
and doesn't expose what it supports at all
so I think we would need to change the fwnode description
to distinguish between what's connected to the parallel bridge and what's connected to the DMA engine
paulk-leonov: note that the media graph and OF graph don't need to match 1:1
pinchartl: yeah that's what I'm realizing just now :)
you can have a single DT node to represent the sun6i-csi, and create a subdev and a video node
but we'd still need two ports, right?
pinchartl: I can't think why we'd dereference the pointer if the count is zero so yes, I'd say so.
pinchartl: If the count is 0 the pointer must be invalid.
paulk-leonov: How many ports does the device really have?
broonie: thanks
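[editor's note] The guarantee confirmed above can be relied on like this (kernel-style sketch, not compilable standalone; variable names are illustrative):

```c
/* With num_consumers == 0 the bulk helpers are no-ops, so passing a
 * NULL array is safe -- it is never dereferenced. */
struct regulator_bulk_data *supplies = NULL;
int num_supplies = 0;	/* e.g. optional supplies absent from DT */
int ret;

ret = regulator_bulk_enable(num_supplies, supplies);
if (ret)
	return ret;
/* ... */
regulator_bulk_disable(num_supplies, supplies);
```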
another question. I have a DT node that corresponds to an I2C device, and subnodes that contain regulators. there's no API in the regulator framework to get regulators from a DT node, they can only be retrieved from a struct device
so my driver creates struct device instances that are associated with the subnodes
sailus: the controller, as in the entity described by the dt node that is composed of a DMA engine and a parallel bridge, can take data in either from the parallel bridge input or from the DMA engine input directly
the latter being used for MIPI CSI-2
I wonder if that's the right way to go, or if a regulator_get_of_node() (or similar) should be exposed
sailus: so in my opinion that's two distinct ports because data flow is different in these two cases
internally the parallel bridge is also connected to the DMA input, but we can't represent that via fwnode (although we can in the media graph)
Ah. Is the DMA in a different device?
sailus: no, it's the same device as the parallel bridge
(in terms of struct device and dt node)
Ah.
Interesting.
So the two are different inputs for DMA then?
I guess two ports should be used in that case.
Most such devices are simple CSI-2 to parallel bridges.
So there's just one port.
right, in our case, the MIPI CSI-2 controller (which is a separate device) doesn't bridge to parallel but to some internal FIFO going to DMA input
sailus: also, we'd probably need a media bus type to describe this kind of internal FIFO-ish link
Could you use the parallel formats?
There probably is one that would fit already.
yes, CSI-2 receivers usually output to a parallel bus
note that the media bus format doesn't need to exactly match the hardware format
the media graph is a logical view of the device
if a link is internal, using a format that conceptually describes the data is fine
as all drivers involved are developed together, there's no interoperability concern
for instance, unpacking YUV 4:2:2 to YUYV8_1X16 at the output of the CSI-2 receiver is a good match
the actual hardware bus may carry data on a 32-bit bus with two pixels per clock cycle
if the two ends of the links were separate devices (for instance an external camera sensor and a parallel receiver) then the exact format should be described, as the drivers are developed separately and can be mixed-and-matched
but when it's internal, it's less of an issue (note however that if one of the two ends of an internal link is an IP core from a third-party that can be reused in different SoCs, handled by a separate driver, then an accurate description of the real format is best)
well I could use parallel to describe the DMA <-> MIPI CSI-2 receiver link
but then what to describe the DMA <-> parallel link?
I thought we didn't want bridges that expose the same mbus format as input and output ;)
note that I'm talking about the media bus type, not the media bus format code
media bus type ?
https://elixir.bootlin.com/linux/latest/source/include/media/v4l2-mediabus.h#L71
are you talking about the bus-type DT property ?
yes, it matches to that
(granted, it's not super-useful)
but I'd rather avoid describing both ports of the device as parallel if one is the actual parallel bridge input while the other is FIFO-ish interface to the DMA engine
how is this all split in IP cores that need to be described in DT ?
do you have a CSI-2 receiver IP core and a parallel+DMA engine IP core ?
pinchartl: yes exactly that
then for the bus between the CSI-2 receiver and the other IP core I think you can skip bus-type as it's an internal bus and you shouldn't need to parse properties from DT. sailus is that correct ?
There's no need for bus-type property if the type is not configurable.
understood
okay so while we're here discussing this...
there's also the ISP matter
the ISP has its own DMA engine for output and can take input either from an input DMA engine or from the camera interface block
in which case, the camera interface block isn't using its DMA engine
but it's still required to be enabled for data to flow from e.g. the mipi csi-2 bridge to the ISP
so if I just describe the camera interface as parallel+DMA subdevs, neither would be in use
although we still need things to be up and running
so I'm thinking, maybe we'd need a third "formatter" subdev, that could be connected to the ISP's non-DMA input?
and following to that, the nature of the ISP v4l2 device would also need to change between m2m and capture depending on whether it's taking input from the camera interface or from DMA
I don't think that's possible at runtime, right?
switching between V4L2 M2M and online processing is indeed not possible
the ISP is also a separate IP core, right ?
okay but we could still look up whether the fwnode input port of the isp is connected or not and decide
at probe time
so it would be a dt-based decision
pinchartl: yes it's a separate IP core
so what I would recommend is
in DT, 3 nodes
CSI-2 receiver, with one input port and one output port
"camera interface" (named CSI by allwinner I think ?), with three ports, one input for the parallel interface, one input connected to the CSI-2 output, one output connected to the ISP input
ISP, with one input port, connected to the CSI output
in the media graph, you would have
CSI-2 receiver subdev, two pads, sink and source
CSI subdev, four pads, one sink for the parallel sensors, one sink connected to the CSI-2 receiver source, one source connected to the DMA engine video node, and one source connected to the ISP
CSI DMA engine video node, one sink pad, connected to the corresponding CSI source
ISP subdev, 3 pads, one sink connected to the CSI source, one sink connected to the ISP input DMA engine, one source connected to the ISP output DMA engine
ISP input DMA engine, a video node with a source pad
ISP output DMA engine, a video node with a sink pad
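[editor's note] The three DT nodes recommended above could be sketched roughly as below. This is illustrative only: node names, labels and endpoint names are made up, and `reg`/`#address-cells` details are omitted for brevity.

```dts
csi2_rx: csi2-receiver {
	ports {
		/* input from the camera sensor */
		port@0 { csi2_in: endpoint { remote-endpoint = <&sensor_out>; }; };
		/* output to the camera interface */
		port@1 { csi2_out: endpoint { remote-endpoint = <&csi_in_csi2>; }; };
	};
};

csi: camera-interface {
	ports {
		/* parallel sensor input */
		port@0 { csi_in_parallel: endpoint { }; };
		/* input from the CSI-2 receiver */
		port@1 { csi_in_csi2: endpoint { remote-endpoint = <&csi2_out>; }; };
		/* output to the ISP */
		port@2 { csi_out_isp: endpoint { remote-endpoint = <&isp_in>; }; };
	};
};

isp: isp {
	port {
		isp_in: endpoint { remote-endpoint = <&csi_out_isp>; };
	};
};
```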
sounds great :)
wait :-)
a few questions
can the output of the CSI be routed to the CSI DMA engine and the ISP at the same time ?
I'm pretty sure it can't
although I haven't tested it, there's no reg field to configure the output path of CSI
and whenever the ISP's input is set to CSI, it goes that way
so I don't see how we could select both at the same time
then in that case you should have a single source pad for the CSI, connected to both the DMA engine and the ISP, and enforce that only one link is enabled at a time
right
same thing for the ISP input actually, there should be a single pad, connected to both the ISP input DMA engine and the CSI source
again, only one link should be enabled at a time
pinchartl: and for the ISP you meant two separate video nodes?
or that would be the same one
yes two separate video nodes
granted, we'd need one media entity per DMA engine
is there a driver for the ISP ?
not yet
will there be one ? :-)
but I would expect userspace to look for a single M2M device for the DMA to DMA use case though
pinchartl: odds are in our favor
so yeah eventually there'll be one
\o/
(we need that at least for debayering and we have all the info to write it)
depending on how complex the ISP is, you may even need more than one subdev to represent it
the M2M API isn't recommended for ISPs
mhh it has lots of stages indeed
oh ok
but e.g. rockchip's RGA is a kind of ISP done via M2M ;)
only a "kind of" ISP :-)
there's no clear cut, but if the ISP can operate in online mode (with a live input from the camera sensor) then the M2M API isn't a good fit, even if it supports an offline mode too
and if you're concerned about the additional complexity for userspace
there's libcamera :-)
understood !
yeah and anyway it's good to allow switching inputs via media graph instead of statically via dt only
well thanks a lot for the talk pinchartl, was very helpful :)
sailus thanks as well!
you're welcome
mhh one thing though, with a single subdev for CSI and two sink pads, we wouldn't be able to represent that only one can be connected at a time, right?
so a single pad may be a better fit after all?
and the parallel bridge would just not be represented
alternatively we can represent the CSI formatter and the parallel bridge as two subdevs and one sink port on the formatter that connects either to the parallel bridge or to mipi csi-2
you're right, a single sink pad is the right solution
we could indeed represent the parallel bridge as a subdev too
if there's something to configure there, it would make sense
if there's nothing to configure (from a userspace point of view) then we can omit it to make the media graph simpler
there are registers dedicated to the bridge's configuration but they can be handled as well with a single subdev since we'd know whether the connected source is parallel or not
pinchartl: one last thing, you mentioned the CSI dt node with 2 sink ports (one for parallel, one for mipi csi-2), I'm not sure whether the same argument that it couldn't represent the impossibility to connect both at the same time applies or whether it's enough that the subdev will restrict to one port
in practice the controller represented by DT can globally take a single input at a time, even if it's not exactly the same input, so I'd be tempted to say one port would be best
given that bus-type exists in dt for that purpose
although in that case we'd need to describe a bus-type for the "internal link" from which mipi csi-2 would come from
sailus: would you agree to add an internal media bus type to handle this case?
V4L2_MBUS_INTERNAL or so
paulk-leonov: from a hardware point of view, the CSI has two inputs, one for parallel sensors, one for CSI-2
and it has an internal mux
so I think it makes sense to have two inputs ports in DT to model that
the physical pins for the parallel input are distinct from the internal port for the CSI-2 input
okay fair enough
then I don't have to care about bus type
mhh one of the controllers on a specific SoC is not tied to a set of pins though
I mean, the parallel input is not tied to a set of pins
so it cannot be used
in this case I'd be tempted to not represent the parallel input port in dt
in other socs there's no mipi csi-2 bridge
so there's a case where port@0 would be parallel and a case where port@0 would be mipi csi-2
actually that happens on the same soc
(we have the same controller duplicated, one for the mipi csi-2 bridge and one for parallel)
while on another soc with the same CSI controller, there's one instance that can be muxed to both inputs
I don't see any other way than explicit bus-types to solve this
paulk-leonov: port numbers don't have to be consecutive
you can hardcode port@0 = parallel, port@1 = CSI-2, port@2 = output, and omit port@0 or port@1 when not implemented
ah nice
I didn't know
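[editor's note] A sketch of the non-consecutive port numbering suggested above (purely illustrative endpoint names):

```dts
ports {
	#address-cells = <1>;
	#size-cells = <0>;

	/* port@0 (parallel input) intentionally omitted on an
	 * instance where it is not wired to any pins */
	port@1 {
		reg = <1>;
		endpoint { remote-endpoint = <&mipi_csi2_out>; };
	};
	port@2 {
		reg = <2>;
		endpoint { /* output to the ISP */ };
	};
};
```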
jernej: I'll drop the LEVELs from cedrus. Perhaps with a comment in the code explaining why it's not there.
And hopefully we can have a new series today for hverkuil to.. merge :) :-)
ezequielg: btw I've been following your work with a distance (thanks for it), but as far as I could see, we don't have multi-slice support yet right?
I thought this should be in before destaging
but it may also be that this can be added later with only additions
still it would probably be good to be 100% sure, right?
no need to follow the work with a distance :-) please follow more closely.
testing on cedrus would be nice, testing with a few samples. i haven't done that.
ezequielg: I'm good with that
jernej: BTW, any plans to test on Cedrus before/after/at some point, with the new uapi?
(changes are minor, but still)
paulk-leonov: everything works with the current API, also multi-slice frames, but maybe not as efficiently as it could if all slices were sent in the same request
ezequielg: at one point, sure, but I can't tell you when
paulk-leonov: also, do you have any plans to work on HEVC destaging?
I'm not comfortable with having any uAPI in staging, since the whole concept is undefined, as mchehab recently said.
usually kwiboo updates ffmpeg patches for new UAPI, but if he has no time, I can take a look
ezequielg: didn't you say once that someone at Collabora is working on HEVC API?
I am working on G2 HEVC.
so eventually, if time permits, and if nobody steps up, I will destage it.
but I'd love to see anyone taking this off my plate, as it's not a picnic.
now you see why I didn't push any further after my first attempt for some new HEVC controls
Are there times when, after a successful VIDIOC_DQBUF ioctl (i.e. the ioctl returns 0), the buffer or its planes aren't filled in properly?
I have a case here where after DQBUF, the plane's bytesused is 40889472 according to GDB (that's 40 megs for a 640x416 NV12 image)
mort: If the returned buffer has V4L2_BUF_FLAG_ERROR set in flags, then the contents of the buffer is invalid for some reason. If this flag is not set, but the contents is bad, then that might be a driver bug.
and ioctl might not return <0 even though that error flag is set?
right. It will just return 0 in that case.
Interesting. Thanks
well, I have some bugs to fix in various v4l2-related code then, heh
This can happen if e.g. a USB transfer fails, or there is a DMA failure of some kind. It's rare, but these things happen. It's a transient error, so it is just reported with that flag.
well, that flag isn't set in my case
mort: hey there.
40 megs is too much, no?
640x416x(3/2) = 399360 bytes, so roughly 390 KiB
you sure that bytesused is not bogus?
for reference, 1080p YUY2 is 4 megs, so...
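[editor's note] The expected NV12 payload being discussed is easy to compute; a minimal helper (hypothetical name, and assuming no driver padding of `bytesperline`):

```c
#include <stddef.h>

/* NV12: a full-resolution Y plane plus an interleaved CbCr plane at
 * half horizontal and half vertical resolution, i.e. 1.5 bytes per
 * pixel overall (ignoring any line padding). */
static size_t nv12_sizeimage(size_t width, size_t height)
{
	return width * height * 3 / 2;
}
```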
that's what I'm saying, it's way bigger than it should be
what driver is this?
qualcom camss
do you know where in the driver's code it's setting the bytesused?
no
mort: what's the output of v4l2-ctl -V? (get the video format)
https://p.mort.coffee/oNI
maybe using sizeimage is more appropriate than bytesused
OK, that looks good. This seems a driver bug, bytesused is definitely the value to use.
I'll try to just hard code the expected size
is this the camss driver from mainline or from a qualcomm BSP?
drivers/media/platform/qcom/camss/camss-video.c:                vb2_set_plane_payload(vb, i, format->plane_fmt[i].sizeimage);
to be 100% precise, it's this: https://github.com/varigit/DART-SD410-kernel/tree/release/kernel-17.06.1
Ah, 4.9.39. You're on your own.
yeeah
upgrading has been on the to-do list for ages now, but upgrading kind of just breaks everything and we haven't taken the time to investigate why because other things have to be done too
well, actually it sets the same field in video_buf_prepare(). I'd debug that, see what it is actually set to.
mort: you are reading the right value? It's a multiplanar format, so bytesused is in buf.m.planes[0].bytesused.
`struct v4l2_plane plane = {0}; info.m.planes = &plane; info.length = 1;` and then reading `plane.bytesused`
OK, just checking :-)
fwiw, the data in the buffer isn't right either, so the real issue might be someplace else entirely
none of the ioctl fail though, and it _usually_ works fine, just not every time
ezequielg: I tested your H264 destaging v3 series on Cedrus/ffmpeg/Kodi stack and it generally works, except that patch 7 rejects controls for some samples and thus they are SW decoded
I didn't yet dive into details why they are rejected
can you test http://jernej.libreelec.tv/videos/h264/20160425_085914.mp4 on your stack?
let me see.
worked here.
do you know _which_ frame it rejects?
no, but I added debug output and the following criterion is violated: log2_max_pic_order_cnt_lsb_minus4 > 12
in the SPS?
that element is 0 in YUView.
yes
this is ffmpeg implementation: .log2_max_pic_order_cnt_lsb_minus4 = sps->log2_max_poc_lsb - 4,
should be ok
let me check other videos
what value do you have?
another video same issue
I don't know, I have to add some more output
if you have a libavcodec repo I can quickly look at it.
keep in mind log2_max_pic_order_cnt_lsb_minus4 is a syntax element. no math is needed.
(unless you added the 4)
third video, same issue
yeah, ffmpeg usually adds the value embedded in the name (in this case 4) immediately after the read
let me check actual value
hm, wonder if you don't have some other issue.
e.g. not passing the control correctly.
because I see libavcodec doing the same 0,12 check.
see "log2_max_frame_num_minus4 out of range (0-12): %d\n"
(thanks a lot for testing, btw)
it's 252, so probably underflow?
if you want a quick review, i can offer that :)
(of the porting code)
you could have some minus one somewhere in the control array passing and it would cause this kind of thing.
coming to think about it, it's nice to have this validate code in the kernel.
you can take a look kwiboo's ffmpeg repo
I didn't do anything besides remove old h264 controls and rename few defines
this part is the same
default branch?
ok, but you have updated the videodev2.h I suppose?
v4l2-request-hwaccel-4.3
not in ffmpeg, that's updated through your patch series
I used LibreELEC build system, so ffmpeg can use kernel headers directly
hm, have you noticed that the SPS and PPS are set in v4l2_request_h264_init? and then again in _start_frame.
yes
I wonder if the H264Context has SPS values at _init time.
oh, that might be it...
they are set in init so the driver can validate everything and tell if it supports decoding that stream or not
that's in documentation...
anyway, sps should be already properly parsed at that time
would you mind removing the SPS/PPS real quick from _init and check?
hm... I don't think that would completely solve the problem
fwiw, neither nvdec nor vaapi deal with SPS or PPS at .init time.
we put in documentation :)
from what I can see, log2_max_pic_order_cnt_lsb_minus4 is only valid if poc_type == 1
sorry, if == 0
so validation would need to ignore it in case poc_type != 0 or even force it to 0
ah, that can be it.
indeed, we have to set it to zero.
I changed code to ".log2_max_pic_order_cnt_lsb_minus4 = (sps->poc_type == 0) ? sps->log2_max_poc_lsb - 4 : 0," in ffmpeg and it works now
but it would be nice to fix it also in kernel
not sure if any other field has same issue
nice catch.
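[editor's note] The fix agreed on above boils down to making the field conditional on the POC type; a small standalone helper capturing it (the function name and plain int parameters are illustrative, following the ffmpeg one-liner quoted in the log):

```c
#include <stdint.h>

/* The H.264 SPS only carries log2_max_pic_order_cnt_lsb_minus4 when
 * pic_order_cnt_type == 0, so the control field must be forced to 0
 * for other POC types instead of passing an underflowed value. */
static uint8_t poc_lsb_minus4(int poc_type, int log2_max_poc_lsb)
{
	return poc_type == 0 ? (uint8_t)(log2_max_poc_lsb - 4) : 0;
}
```

Without the condition, a stream with poc_type != 0 leaves log2_max_poc_lsb at 0 and the subtraction wraps to 252 in an 8-bit field, which is exactly the out-of-range value seen above.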