[03:23:33] Hi, all, I'm currently reverse engineering the register interface for a hardware video decoder, but it's been difficult because I don't really know the terminology and what kind of elements would be in hardware. Are there any good resources on learning this stuff? Or should I just go ahead and read the relevant video coding standards? [03:32:25] cyrozap: chances are that it may be something already supported by Linux, just integrated by another SoC vendor [03:32:25] so I'd recommend checking existing drivers [03:32:47] and also the relevant V4L2 interfaces, as they expose pretty much the information that the hardware provides/consumes [03:32:49] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-mem2mem.html [03:44:23] tfiga: I should clarify--this isn't a decoder on a SoC, it's a decoder on an external device that connects over PCIe. Currently, this decoder runs firmware that does all the bitstream parsing work, but I want to write my own custom firmware to expose the internal hardware interfaces to the host PC. Thank you for the link to the V4L2 docs, though--they seem helpful. [03:45:04] I'd still check the existing drivers, as there is a limited number of video codec IP block providers :) [03:45:12] and that PCI device likely contains a SoC inside [03:55:36] tfiga: It's the Broadcom Crystal HD (blast from the past, I know :P). It seems to be based on some video decoder chips Broadcom made back in ~2004/2005 (it might even just be a die shrink), and runs its firmware on an ancient 32-bit only ARC core (modern ARCompact and ARCv2 are essentially a completely different ISA). I'll check the drivers to see if anything looks similar, but I'm really not [03:55:42] expecting much :) [03:55:54] This is the directory, right? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/platform [03:56:04] yeah, also please check drivers/staging/media [03:56:30] Got it, thanks! [03:57:08] cyrozap: have you seen https://en.wikipedia.org/wiki/Broadcom_Crystal_HD#Linux ? tfiga: Yup, I know there's already a downstream driver for the chip, and it actually details a lot of the registers already so it's been a big help. hi so I'm working on a driver for the Allwinner ISP many things are sorted out but one: we currently have a sun6i-isp which registers its v4l2_device and media_device however, I need to use the sun6i-csi subdev from the sun6i-isp driver and sun6i-isp has its own v4l2_device and media_device so the only option I'm seeing to only have one set of v4l2/media_device is to use the component framework with those two drivers under it and a shared data structure the downside is that it'll require an extra dt node for the component driver to load which will result in old dts no longer probing is there a better way to handle this kind of scenario? dwlsalmeida: I plan to post a new version of the dynamic array code and also support for read-only requests on top of the very latest media_tree master branch today or Monday. Now that the v4l2-ctrls.c split has been merged, it is a lot cleaner. You can use that as the base for your work (provided you can run on the latest kernel, of course). paulk: the components framework kind of sucks :-S what's the hardware architecture ? pinchartl: heh I've heard that before pinchartl: you mean the SoC? or the general picture? the CSI-2 receiver and the ISP (and possibly other IP cores) how are they connected at the hardware level ? "however, I need to use the sun6i-csi subdev from the sun6i-isp driver" -> why? any blck diagram ? the flow is sensor -> MIPI CSI-2 bridge -> CSI bridge -> ISP not really any block diagram sorry what's the CSI bridge ? it's the basic camera interface that can be used standalone it has a DMA engine but when using the ISP, its DMA engine is not used ok however it still needs to be enabled because it muxes input sources so the CSI bridge will be the one selecting the MIPI CSI-2 bridge as input when you said "we currently have a sun6i-isp which registers its v4l2_device and media_device" did you mean sun6i-csi ? oh right, sorry mripard: we do need to start the sun6i-csi block to get data from the ISP I'd move the media_device and v4l2_device from sun6i-csi to sun6i-isp although how is it all described in DT today ? today there's a node for sun6i-csi and a node for the mipi csi-2 bridge are they connected using ports ? with fwnode graph between them yes good so I suppose you'll add an outport port for the sun6i-csi, connected to the sun6i-isp ? yes but that's not present in DT today, right ? correct bit of a shame that one :S it prevents moving the media device to the sun6i-isp are those IP cores reused on other SoCs ? only on Allwinner SoCs as far as I know connected in the same way ? or could we have, for instance, in a different SoCs, two CSI-2-RX + sun6i-csi pairs, both of them connected to the same ISP instance ? well the ISP can actually take input from multiple sun6i-csi instances yes for example on a given soc there's a sun6i-csi instance connected to a mipi csi-2 bridge and one for parallel and the ISP can take input from either one of them so in practice I was thinking 2 input fwnode ports for the isp paulk: is there an ISP on all the SoCs using the sun6i-csi IP? mripard: let me check so here's how I would do it in the sun6i-csi driver if there's no output port in DT don't change anything if there's an output port in DT don't register a media device and v4l2 device, but register and async subdev mripard: I think A64 doesn't have an ISP the ISP driver will register the media device and v4l2 device in that case that way you make the ISP the master component when present, but don't break backward compatibility with existing DT pinchartl: right I was about to suggest something like that so that's okay? and it will still work in systems that don't have an ISP it's OK for me :-) side question, do you have documentation on the ISP ? do you know how to program it ? I mean, is there a chance that it breaks if the isp node is status = "disabled" ? or if the driver is not enabled, etc if the driver isn't enabled it won't work, but I don't think it's an issue okay if the status is disabled, we can support that by checking in the sun6i-csi driver if the remote DT node is enabled or not okay fair enough seems like a fair middle ground thanks a lot :) you're welcome pinchartl: I know enough about the ISP yes what are your plans for the userspace part ? not docs though pinchartl: I think the initial submission will mostly be debayering and maybe scaling so not much to care about for userspace but it also does 3A and stuff like histogram so in the future it'll probably be a candidate for libcamera or such the API needs to be designed with the whole ISP in mind, so it's important to consider all the features to ensure we won't break backward compatibility in the future but honestly bootlin doesn't have plans to work on it for nowe pinchartl: definitely pinchartl: many of the blocks take very specific parameters for example there's a denoise filter which takes a table of parameters that's a common situation in ISPs there's also LUTs for dead pixel correction the standard practice these days is to put all the parameters in a buffer, with a metadata video node I see it depends a bit on the ISP architecture so I can't comment much there's no block diagram from allwinner, but based on your knowledge of the hardware, can you make one ? so I have a good idea of the different blocks in it, it's just unclear what the processing order is although I could take a look at newer generations even just a list of blocks would be a good start which do have a block diagram in the docs that's useful too it would be nice to test statistics support early does the hardware support memory-to-memory operation with the ISP, or is it limited to inline processing of data from the CSI bridge ? pinchartl: that's a good question pinchartl: we do see registers to take input from DRAM but I was unable to make them work pinchartl: I did contact Allwinner about them and they said they no longer have the info of how to make it work lovely :-) do you know if it's an IP core they have developed, or if they have licensed it ? it's also quite possible that the clocks/dma engine for it are broken/missing pinchartl: that is unclear too but I would tend to believe they did it in house pinchartl: it's been evolving quite a lot with allwinner platforms with a first generation that was tied to the CSI controller for sun4i-a10/sun7i-a20 then they made it standalone (sun6i) and now they've reworked it a bit for newer generations (it's called isp500, isp520 and isp522) ok I would also recommend looking at libcamera support early on ;-) so back to DRAM, I think I'll work with the assumption that this mode is not available but if someone ever manages to make it work, I guess it won't change the fundamental nature of the driver pinchartl: yeah that would definitely be the goal, but the scope that Bootlin has for now is more or less just debayering even for that considering backward compatibility you'll likely have to implement the input parameters buffer mechanism right away here's the list of modules btw https://paste.ack.tf/ceb73c and you don't want to force applications to deal with that thanks seems fairly standard pinchartl: I was thinking of using some v4l2 controls for easy stuff that won't scale when you'll need to add support for the rest it's a dead end I see, so no controls at all? I agree that it won't scale, but I was wondering if a mixed approach could work with a mixed approach you'll have to synchronize the controls with the parameters buffers, and thus use the request API it will increase the complexity, both in kernelspace and in userspace yeah better forget about that but then I guess that the parameter buffer must include data for all the modules from the start so I better understand why you emphasised that :) yep :-) you can look at the rkisp1 driver for an example of how it can be done yep, seems like I will end up doing something very similar hverkuil ok hans :) thanks for the tip mchehab: I don't think IRC logger works here latest log is from 1. June dwlsalmeida: ok, I'll take a look jernej thanks! :) mchehab: while we're at it: the links to the git repos here: https://linuxtv.org/repo/ do not work. They all contain '/cgit.cgi', which should be removed. dwlsalmeida: I see some bugs just from going over the code mchehab: thanks! I stumbled across that today, nice to have fixed. mchehab: if the logger can join the channel I suppose it will work dwlsalmeida: for example: VE_H264_VLD_ADDR_FIRST | VE_H264_VLD_ADDR_VALID |VE_H264_VLD_ADDR_LAST must always be kept together because all these 3 flags together tells that you have whole slice and VE_H264_VLD_OFFSET should always be set to 0 offset adjustment is done by skip function because of HW quirk some videos fail if you set offset directly to wanted position instead, you have to set offset to 0 and execute skip command, so internal pointer points to wanted offset dwlsalmeida: do you really need new control for array? size was present before but dropped during H264 control refinement offset is imho also not needed first slice in array should have sufficient header_bit_size set jernej I was told not to add anything to the current h264 controls since they're stable now jernej I missed your comment on the skip function in the code, you've warned against the very thing I tried to do heheh jernej the one piece that is missing to me is: where do we keep track of our current position in the capture buffer? Since I am calling setup() and trigger() in a loop now I assume this will overwrite the decoded data at every iteration I recommend that you split setup in two one for "do it once" and one for do it for every slice (probably majority) in "do it once" function, you set up buffer and every time slice is decoded, HW will keep pointer in buffer you just have to make sure header_bit_size is each slice is exactly right and I don't think you have to change any meaning in current controls for array of slices just mark V4L2_CID_STATELESS_H264_SLICE_PARAMS as array *as dynamic array from driver perspective, it doesn't matter if it contains only one or more elements code should be generic enough I also don't think your current approach calling setup() and trigger() in a loop would work ISR releases job and you don't want to do that you have to setup new decoding from ISR (imo not the best option) or suppress call to v4l2_m2m_buf_done_and_job_finish() in ISR if there are more slices to decode in current job albeit, first option is easier, because trigger() doesn't wait on interrupt for second option you would need some kind of semaphore > suppress call to v4l2_m2m_buf_done_and_job_finish() in ISR if there are more slices to decode in current job that's what I did, looks a bit messy though ngl. I totally skipped on the semaphore part though anyway, it will be hard for you to do it right if you don't have any HW to test on :) I have a board on the way, thanks for the tips, I really appreciated it :) yeah, that will be a problem for a while, just change it to gmail what is the status regarding hevc, btw? anything merged? jernej: hans sent the benjamin's g2 driver -- and today a new series adding some features was sent hverkuil: I'm reviewing Alex Bee's series, so please don't merge anything :) rockchip has this hantro-ish ip core which we made part of the hantro driver, but ndufresne and i are seriously thinking we could move that to its own driver. it's getting messier to have hantro as a unifying driver. ezequielg: did you have a chance to check my HEVC patch? dependent slice segments? yes didn't see it in detail, tbh. But John seemed happy so ;) I'm not sure if I should re-submit it with new e-mail