[03:23:33] Hi, all, I'm currently reverse engineering the register interface for a hardware video decoder, but it's been difficult because I don't really know the terminology and what kind of elements would be in hardware. Are there any good resources on learning this stuff? Or should I just go ahead and read the relevant video coding standards?
[03:32:25] cyrozap: chances are that it may be something already supported by Linux, just integrated by another SoC vendor
[03:32:25] so I'd recommend checking existing drivers
[03:32:47] and also the relevant V4L2 interfaces, as they expose pretty much the information that the hardware provides/consumes
[03:32:49] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-mem2mem.html
[03:44:23] tfiga: I should clarify--this isn't a decoder on a SoC, it's a decoder on an external device that connects over PCIe. Currently, this decoder runs firmware that does all the bitstream parsing work, but I want to write my own custom firmware to expose the internal hardware interfaces to the host PC. Thank you for the link to the V4L2 docs, though--they seem helpful.
[03:45:04] I'd still check the existing drivers, as there is a limited number of video codec IP block providers :)
[03:45:12] and that PCI device likely contains a SoC inside
[03:55:36] tfiga: It's the Broadcom Crystal HD (blast from the past, I know :P). It seems to be based on some video decoder chips Broadcom made back in ~2004/2005 (it might even just be a die shrink), and runs its firmware on an ancient 32-bit only ARC core (modern ARCompact and ARCv2 are essentially a completely different ISA). I'll check the drivers to see if anything looks similar, but I'm really not
[03:55:42] expecting much :)
[03:55:54] This is the directory, right? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/platform
[03:56:04] yeah, also please check drivers/staging/media
[03:56:30] Got it, thanks!
[03:57:08] cyrozap: have you seen https://en.wikipedia.org/wiki/Broadcom_Crystal_HD#Linux ?
tfiga: Yup, I know there's already a downstream driver for the chip, and it actually details a lot of the registers already so it's been a big help.
hi
so I'm working on a driver for the Allwinner ISP
many things are sorted out but one: we currently have a sun6i-isp which registers its v4l2_device and media_device
however, I need to use the sun6i-csi subdev from the sun6i-isp driver
and sun6i-isp has its own v4l2_device and media_device
so the only option I'm seeing to only have one set of v4l2/media_device is to use the component framework with those two drivers under it
and a shared data structure
the downside is that it'll require an extra dt node for the component driver to load
which will result in old dts no longer probing
is there a better way to handle this kind of scenario?
dwlsalmeida: I plan to post a new version of the dynamic array code and also support for read-only requests on top of the very latest media_tree master branch today or Monday. Now that the v4l2-ctrls.c split has been merged, it is a lot cleaner. You can use that as the base for your work (provided you can run on the latest kernel, of course).
paulk: the components framework kind of sucks :-S
what's the hardware architecture ?
pinchartl: heh I've heard that before
pinchartl: you mean the SoC?
or the general picture?
the CSI-2 receiver and the ISP (and possibly other IP cores)
how are they connected at the hardware level ?
"however, I need to use the sun6i-csi subdev from the sun6i-isp driver" -> why?
any blck diagram ?
the flow is sensor -> MIPI CSI-2 bridge -> CSI bridge -> ISP
not really any block diagram sorry
what's the CSI bridge ?
it's the basic camera interface that can be used standalone
it has a DMA engine
but when using the ISP, its DMA engine is not used
ok
however it still needs to be enabled
because it muxes input sources
so the CSI bridge will be the one selecting the MIPI CSI-2 bridge as input
when you said
"we currently have a sun6i-isp which registers its v4l2_device and media_device"
did you mean sun6i-csi ?
oh right, sorry
mripard: we do need to start the sun6i-csi block to get data from the ISP
I'd move the media_device and v4l2_device from sun6i-csi to sun6i-isp
although
how is it all described in DT today ?
today there's a node for sun6i-csi and a node for the mipi csi-2 bridge
are they connected using ports
?
with fwnode graph between them
yes
good
so I suppose you'll add an outport port for the sun6i-csi, connected to the sun6i-isp ?
yes
but that's not present in DT today, right ?
correct
bit of a shame that one :S it prevents moving the media device to the sun6i-isp
are those IP cores reused on other SoCs ?
only on Allwinner SoCs as far as I know
connected in the same way ? or could we have, for instance, in a different SoCs, two CSI-2-RX + sun6i-csi pairs, both of them connected to the same ISP instance ?
well the ISP can actually take input from multiple sun6i-csi instances yes
for example on a given soc there's a sun6i-csi instance connected to a mipi csi-2 bridge and one for parallel
and the ISP can take input from either one of them
so in practice I was thinking 2 input fwnode ports for the isp
paulk: is there an ISP on all the SoCs using the sun6i-csi IP?
mripard: let me check
so here's how I would do it
in the sun6i-csi driver
if there's no output port in DT
don't change anything
if there's an output port in DT
don't register a media device and v4l2 device, but register and async subdev
mripard: I think A64 doesn't have an ISP
the ISP driver will register the media device and v4l2 device in that case
that way you make the ISP the master component when present, but don't break backward compatibility with existing DT
pinchartl: right I was about to suggest something like that
so that's okay?
and it will still work in systems that don't have an ISP
it's OK for me :-)
side question, do you have documentation on the ISP ?
do you know how to program it ?
I mean, is there a chance that it breaks if the isp node is status = "disabled" ?
or if the driver is not enabled, etc
if the driver isn't enabled it won't work, but I don't think it's an issue
okay
if the status is disabled, we can support that
by checking in the sun6i-csi driver if the remote DT node is enabled or not
okay fair enough
seems like a fair middle ground
thanks a lot :)
you're welcome
pinchartl: I know enough about the ISP yes
what are your plans for the userspace part ?
not docs though
pinchartl: I think the initial submission will mostly be debayering
and maybe scaling
so not much to care about for userspace
but it also does 3A
and stuff like histogram
so in the future it'll probably be a candidate for libcamera or such
the API needs to be designed with the whole ISP in mind, so it's important to consider all the features to ensure we won't break backward compatibility in the future
but honestly bootlin doesn't have plans to work on it for nowe
pinchartl: definitely
pinchartl: many of the blocks take very specific parameters
for example there's a denoise filter which takes a table of parameters
that's a common situation in ISPs
there's also LUTs for dead pixel correction
the standard practice these days is to put all the parameters in a buffer, with a metadata video node
I see
it depends a bit on the ISP architecture so I can't comment much
there's no block diagram from allwinner, but based on your knowledge of the hardware, can you make one ?
so I have a good idea of the different blocks in it, it's just unclear what the processing order is
although I could take a look at newer generations
even just a list of blocks would be a good start
which do have a block diagram in the docs
that's useful too
it would be nice to test statistics support early
does the hardware support memory-to-memory operation with the ISP, or is it limited to inline processing of data from the CSI bridge ?
pinchartl: that's a good question
pinchartl: we do see registers to take input from DRAM
but I was unable to make them work
pinchartl: I did contact Allwinner about them and they said they no longer have the info of how to make it work
lovely :-)
do you know if it's an IP core they have developed, or if they have licensed it ?
it's also quite possible that the clocks/dma engine for it are broken/missing
pinchartl: that is unclear too but I would tend to believe they did it in house
pinchartl: it's been evolving quite a lot with allwinner platforms
with a first generation that was tied to the CSI controller for sun4i-a10/sun7i-a20
then they made it standalone (sun6i)
and now they've reworked it a bit for newer generations (it's called isp500, isp520 and isp522)
ok
I would also recommend looking at libcamera support early on ;-)
so back to DRAM, I think I'll work with the assumption that this mode is not available
but if someone ever manages to make it work, I guess it won't change the fundamental nature of the driver
pinchartl: yeah that would definitely be the goal, but the scope that Bootlin has for now is more or less just debayering
even for that
considering backward compatibility
you'll likely have to implement the input parameters buffer mechanism right away
here's the list of modules btw https://paste.ack.tf/ceb73c
and you don't want to force applications to deal with that
thanks
seems fairly standard
pinchartl: I was thinking of using some v4l2 controls for easy stuff
that won't scale when you'll need to add support for the rest
it's a dead end
I see, so no controls at all?
I agree that it won't scale, but I was wondering if a mixed approach could work
with a mixed approach you'll have to synchronize the controls with the parameters buffers, and thus use the request API
it will increase the complexity, both in kernelspace and in userspace
yeah better forget about that
but then I guess that the parameter buffer must include data for all the modules
from the start
so I better understand why you emphasised that :)
yep :-)
you can look at the rkisp1 driver for an example of how it can be done
yep, seems like I will end up doing something very similar
hverkuil ok hans :) thanks for the tip
mchehab: I don't think IRC logger works here
latest log is from 1. June
dwlsalmeida: ok, I'll take a look
jernej thanks! :)
mchehab: while we're at it: the links to the git repos here: https://linuxtv.org/repo/ do not work. They all contain '/cgit.cgi', which should be removed.
dwlsalmeida: I see some bugs just from going over the code
mchehab: thanks! I stumbled across that today, nice to have fixed.
mchehab: if the logger can join the channel I suppose it will work
dwlsalmeida: for example: VE_H264_VLD_ADDR_FIRST | VE_H264_VLD_ADDR_VALID |VE_H264_VLD_ADDR_LAST must always be kept together
because all these 3 flags together tells that you have whole slice
and VE_H264_VLD_OFFSET should always be set to 0
offset adjustment is done by skip function
because of HW quirk
some videos fail if you set offset directly to wanted position
instead, you have to set offset to 0 and execute skip command, so internal pointer points to wanted offset
dwlsalmeida: do you really need new control for array? size was present before but dropped during H264 control refinement
offset is imho also not needed
first slice in array should have sufficient header_bit_size set
jernej I was told not to add anything to the current h264 controls since they're stable now
jernej I missed your comment on the skip function in the code, you've warned against the very thing I tried to do heheh
jernej the one piece that is missing to me is: where do we keep track of our current position in the capture buffer? Since I am calling setup() and trigger() in a loop now I assume this will overwrite the decoded data at every iteration
I recommend that you split setup in two
one for "do it once"
and one for do it for every slice (probably majority)
in "do it once" function, you set up buffer
and every time slice is decoded, HW will keep pointer in buffer
you just have to make sure header_bit_size is each slice is exactly right
and I don't think you have to change any meaning in current controls for array of slices
just mark V4L2_CID_STATELESS_H264_SLICE_PARAMS as array
*as dynamic array
from driver perspective, it doesn't matter if it contains only one or more elements
code should be generic enough
I also don't think your current approach calling setup() and trigger() in a loop would work
ISR releases job and you don't want to do that
you have to setup new decoding from ISR (imo not the best option) or suppress call to v4l2_m2m_buf_done_and_job_finish() in ISR if there are more slices to decode in current job
albeit, first option is easier, because trigger() doesn't wait on interrupt
for second option you would need some kind of semaphore
> suppress call to v4l2_m2m_buf_done_and_job_finish() in ISR if there are more slices to decode in current job
that's what I did, looks a bit messy though ngl. I totally skipped on the semaphore part though
anyway, it will be hard for you to do it right if you don't have any HW to test on :)
I have a board on the way, thanks for the tips, I really appreciated it :)
yeah, that will be a problem for a while, just change it to gmail
what is the status regarding hevc, btw?
anything merged?
jernej: hans sent the benjamin's g2 driver -- and today a new series adding some features was sent
hverkuil: I'm reviewing Alex Bee's series, so please don't merge anything :)
rockchip has this hantro-ish ip core which we made part of the hantro driver, but ndufresne and i are seriously thinking we could move that to its own driver.
it's getting messier to have hantro as a unifying driver.
ezequielg: did you have a chance to check my HEVC patch?
dependent slice segments?
yes
didn't see it in detail, tbh. But John seemed happy so ;)
I'm not sure if I should re-submit it with new e-mail