Kwiboo: Hi why do we need to call pm_runtime_dont_use_autosuspend() in the error/remove path? bbrezillon: presumably to keep the device state clean? although I don't understand why it couldn't be automatically called for us on unbind I don't remember the need to call it before, though bbrezillon: https://patchwork.kernel.org/patch/8275431/ tfiga: thx mchehab: when do you plan to pull the last requests for v5.2 ? (if not done already :-)) mripard: tried to rebase the cedrus h264 series (as requested by mchehab) and I got sparse warnings. Can you fix this and post a v10? mchehab, does your gpg supports ed25519 (elliptic curve), I signed git tags with ed25519 Kwiboo: could you have a look at https://github.com/bbrezillon/linux/commits/rk-vpu-mpeg2-v4 and let me know if you want me to change anything in there pinchartl: I'd say up to Thrursday, as I'm intending to handle them on Friday svarbanov: I suspect that the problem were not with the algorithm itself, but the e-mail you sent sounded truncated anyway, here, gpg --version says that it supports the following algo: Pubkey: RSA, RSA-E, RSA-S, ELG-E, DSA Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH, CAMELLIA128, CAMELLIA192, CAMELLIA256 Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224 Compression: Uncompressed, ZIP, ZLIB, BZIP2 (I'm using gpg version 1.4.23) that's the latest gpg packet on fedora 29: gnupg-1.4.23-2.fc29.x86_64 hmm... I have another gpg package here with version 2.2.13 with supports: Pubkey: RSA, ELG, DSA, ECDH, ECDSA, EDDSA Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH, CAMELLIA128, CAMELLIA192, CAMELLIA256 Hash: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224 Compression: Uncompressed, ZIP, ZLIB, BZIP2 mchehab, ok, I'll send another pull request with two more patches and signed with RSA, and also I sent the key to few keyservers svarbanov: ok, thanks! svarbanov: not sure if git verify-tag calls gpg or gpg2 (or if it calls a gpg library internally) mchehab: thank you mchehab, sent [GIT PULL for v5.2] Venus updates - take2, hope that it will be fine now I'll check mchehab, thanks! paulk-leonov, Kwiboo: I was looking at the upper layer in ffmpeg hwaccel, I only see code clipping the NAL headers, but was left under the impression that Cedrus needed the slice header to be clipped, maybe you can give me some light ? Did I miss-interpret the bitstream that Cedrus needs & * ? ndufresne, mhh, depends on what you mean by clipped :) I simply mean that the pointer passed would be after the type byte and the slice headers * that my impression was that it was ^ clipped would mean forcibly copied into a fresh buffer I suppose ? (or mem moved) mhh not sure about that, there's a chance we just use the offset provided by vaapi and it just works is that the offset in bits that is in VAAPI/V4L2 tables ? ndufresne, yes h264 has slice->header_bit_size = VASlice->slice_data_bit_offset; ok, thanks, that is very interesting, https://github.com/intel/libva/blob/master/va/va.h#L3048 it removes a lot of the mistery around the emulation bytes at the same time things are less clear when it comes to hevc though I had to do some trickery in our libva backend code so basically original buffer is [AVC HEADER or START CODE][NAL UNIT BYTE][SLICE HEADER][SLICE DATA] FFMPEG passes a pointer to the NAL UNIT BYTE, and set the offset (in bits) to the SLICE DATA I wonder if it means the slice data is not always bytes aligned it's not, for certain and it says that the offset in bits is post removal of the emulation prevention bytes which imply that the emulation bytes are present or at least *can be* present paulk-leonov, I was confused, because a very old discussion was speaking about creating slice headers from the controls, but this has never been needed, since it's passed to the driver, even if not always used what I'm thinking, is that if we make emulation prevention bytes mandatory (I'll have to bootstrap my H3 really), we could abstract RK and Cedrus in the end since prepending startcode in RK driver would not really be a performance blocker ndufresne, mhh, how do you prepend in the driver? but if emulation prevention bytes (let call these EPB) should not be present in what we pass to Cedrus, it's a different story without allocating a new buffer and copying? it's called data_offset oh right I keep forgetting about that for once we have a valid use case that basically makes the need for an annex-b format go away note that ffmpeg always copy into v4l2 buffers for encoded stream, no one seems to have noticed well, rather, the "mixed annex-b + slice header in control" out of my list yeah sure yeah, but I really need a way to validate the EPB stuff on cedrus bbrezillon, ezequielg ^ you should read last 20 lines hehe was doing it thanks need someone to tell me if I'm dreaming something ;-D ndufresne, we need to talk about DPB management yeah I think there's something off with the cedrus h264 implementation about that right, theoritically, all I'm missing right now would be the reference frame type (IPB) on the DPB * DPB Entry oh ndufresne, btw it would be really good to list precisely what's missing from controls for rk maybe it's related to ayaka's comment, it was forumlated as a rant, so I ignored gotta leave very soon (catching a train) but tl;dr the dpb we pass has evicted entries too, when it shouldn't I'll make sure to work on this with bbrezillon if that's really required by the hw, we should do it in the driver (I think we discussed it already some days ago) bbl paulk-leonov, oh, right, I saw this bug you cannot drop a lost reference like you do now actually, that one is not the same bug evicted entries ... maybe a bug in the flushing on I frames ? * I meant on IDR frames, I frame don't flush the DPB I wonder if there is a minimally working free validation set for our codecs ... maybe the ffmpeg tests are sufficient paulk-leonov, Kwiboo: Anyone tried to run ffmpeg tests against Cedrus ? tfiga, to save you some reading, I have one question that resume our conversation here tfiga, do you think it is possible to deduce p/b0/b1 list for RK if we add the reference frame type (I/P/B) in the DPB entry ? if not why ? bbrezillon, I will take a look (and test) your rk-vpu-mpeg2-v4 branch shortly, will get back to you ndufresne, the ffmpeg hwaccel should set same ctrl values as libva-v4l2-request, I checked how field values transfered from ffmpeg <-> vaapi <-> libva-v4l2-request in the begining and also added a few ctrl values that was not set by libva-v4l2-request a great, so I just need to dump in vaapi on my PC if I need some info thanks @Kwiboo there is also a possibility we added a new ctrl field to support some feature that was missing in cedrus, not sure jernej pushed that change, could have been for hevc or h264 if you plan to look at ffmpeg code you will se a lot of similarities with the ffmpeg vaapi and vdpau hwaccels, they was used as reference I have not run any test suite on hwaccel, just testing different samples of working/broken samples and look if they show up correctly on screen the dpb may differ a little bit from vaapi, there was a lot of testing with different ordering and what entries to include, the order or what pictures that was included did not seem to matter for cedrus there is/was an issue in cedrus in that if the buffer used as output target was also included in the dbp wrong "position" got used and wrong motion vectors etc could be used the DPB should be as the spec require imho though but noted Kwiboo, do you have specific patches around the dpb in ffmpeg, since from my first read it comes from the same place for all accel and does this special case only happen if you have broken slice ? no we use plain ffmpeg (4.0) with only a few patches, mostly for rpi and rkmpp decoder at https://github.com/LibreELEC/LibreELEC.tv/tree/master/packages/multimedia/ffmpeg/patches, nothing should change how dpb or other decoding is handled https://github.com/LibreELEC/LibreELEC.tv/blob/master/projects/Allwinner/patches/linux/0007-H264-improvements.patch#L37-L40 <-- this is the change that was made in cedrus to be able to handle slice or field encoded videos and the related code for position if destination buffer is in dbp, use that buffer as output target instead of a new "position" ah, so you mean that if the timestamp in the reference is one of the currently being decoded frame, it would not find it which makes sense, the reference is cleared on qbuf that was with interlaced content ? I think it was interlaced content yes, it was after we fixed this API bug, you won't have to deq/queue in decode_slice, so that problem should go away ndufresne: Are you talking about multi-slice frames? it is not that timestamp ref was clered, it is an issue in cedrus not api jernej, yes, that would also fix this interlaced case not sure, as Kwiboo says, it's driver issue, not API well, you cannot reference the same frame as the one you are decoding to in current v4l2 API why not, seems to work with cedrus because the we clear references from the list on qbuf I mean, otherwise you need to make sure to "assign" the new capture buffer to the current TS before or something, maybe that works, but it seems a bit ... is that something new or recent (last 3 months) that has changed? because this was working fine when this was tested and fixed in cedrus I think it's introduced by the use of timestamp since it's the driver that need to pass somehow the timestamp from output to capture buffers with index it would not have been an issue anyway, split slices in interlaced content, or multi-slices are the same yep, I think that should happen before decoding if anyhing and not after decoding is finished, need to check maybe this is not working as expected with latest v4l2 core we need to cleanly handle both in H265, there is only split slicing support iirc the encoded stream is never "interleaved" if I can abuse that term btw what vpu hw are you working with on rk? rkvdec? the vpu1/2 h264 has a start_code_e register that seems to control how stream should be parsed bbrezillon is working on the driver, we target vpu2 for now I'm just handling the userspace, which when I looked I was sad to see that things were going in split directions So I'm working (part time) on doing as much reconciliation as I can (for the context, my past experience with this is limited to Intel VAAPI, I did some maintenance on gstreamer-vaapi, there was issues handling lost field of interlaced content) so API wise, we are missing the equivalent of reference field in other stack, which tells which field (top/bottom) is being referenced and we are missing the ability to batch slices, forcing use to dequeue and queue again the capture buffer didn't you said the other day that multi-slice frames are meant as speed improvement, so batching slices would be detrimental to their purpose ? if vpu2 is the current target I guess hw reg for start_code_e could be used and same stream can be used as cedrus?, it seems to be set to 1 in mpp library but other hantro g1 code seems to set it when "NAL start prefix in stream start is 0 0 0 or 0 0 1" and some "frame_num workaround" is not used ndufresne: not sure which one is vpu2, but we need to support both rk3399 and rk3288 Kwiboo: ^ me neither, I just repeated what ezequielg told me to be honnest rk3288 has vpu1+hevc and rk3399 has vpu2+rkvdec (both can decode h264) so we need both :) vpu2 also embeds an h264 decoder? thought h264 decoding was left to the vdec block yes both should have h264 decoder but vdec probably have better support for 4k ok ndufresne: had a look at the 3 ref_list vs 2 ref_list scheme, and I also have the feeling one format can be turned into the other ok, so let's try and confirm this we have asked Maxime to re-add those list, so we should maybe avoid dropping and reading again sure you probably notice that those list were per frame, instead of a per slice, that's because the code that existed before didn't support multi-slice, or interleced ndufresne: have to go see you o/ feel free to continue the discussion I'll read the backlog tomorroz ok tomorrow regarding h264 and vpu1+2 I just pushed a commit to https://github.com/Kwiboo/rockchip-vpu-regtool/commit/5fd261a5777c9b8f91c03a2d7ab43e8b969ee943 that adds basic boilerplate for setting the hw regs sorry my ignorance, what is this tool for ? generating vpu1/2 hwreg code, I did not want to do manual sync of the mpeg-2 hwreg code reads txt-files with hwreg = value and generates starting code so it generate the reg for a specific slice setup ? turns https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/mpeg2.txt into https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_vpu_hw_mpeg2_dec.c (and rk3288 counterpart) and that code is the base for the mpeg2 decoder in current patchset I only ever finished and validated the mpeg2 regs file, the h264/mpeg4/vc1 files is not complete and lots of other code needs to be added in addition to setting hw regs My strategy was to pass the same file over the mpp vendor stack (with traces that dumps registers I had a board setup with manual build of mpp ahh that should also work :), I did a grep of "->sw" on https://github.com/rockchip-linux/mpp/blob/release/mpp/hal/rkdec/h264d/hal_h264d_vdpu2.c to see what regs gets set, then some manual lookup / translate into similar condition using v4l2 ctrl param bbrezillon, I found one issue with v4, same issue with double free of video device on rmmod, https://github.com/Kwiboo/linux-rockchip/commit/117023f05b3ae72befbb6ec8c6349184cc0dd5f4 seems to fix the issue reverting to video_device_alloc() instead of devm_kzalloc() is also an option a small nit for symmetry would be to move rk3288_vdpu_irq() (and related .vdpu_irq = rk3288_vdpu_irq) from rk3288 commit to "Add decoder boilerplate" commit as that is where the rk3399 counterpart is added same could be done for rk3288/rk3399_vpu_dec_reset, moved from mpeg-2 commits to "Add decoder boilerplate", but they are only referenced once decoder is added Kwiboo: or move the rk3399 bits to commit "rockchip/vpu: Add MPEG2 decoding support to RK3399" yep, up to you :-), those functions originate from the chromium vpu driver Kwiboo: sorry, brain fart, those are generic irq handlers for the decoder, they should definitely be part of the "Add decoder boilerplate" commit hm, not sure it's really important anyway, as those are only be used when we start adding decoder formats Kwiboo: I force pushed to my v4 branch if you're fine with this version I'll send it tomorrow sure, looks good, thanks :-)