Kwiboo: Hi
why do we need to call pm_runtime_dont_use_autosuspend() in the error/remove path?
bbrezillon: presumably to keep the device state clean? although I don't understand why it couldn't be automatically called for us on unbind
I don't remember the need to call it before, though
bbrezillon: https://patchwork.kernel.org/patch/8275431/
tfiga: thx
mchehab: when do you plan to pull the last requests for v5.2 ? (if not done already :-))
mripard: tried to rebase the cedrus h264 series (as requested by mchehab) and I got sparse warnings. Can you fix this and post a v10?
mchehab, does your gpg supports ed25519 (elliptic curve), I signed git tags with ed25519
Kwiboo: could you have a look at https://github.com/bbrezillon/linux/commits/rk-vpu-mpeg2-v4 and let me know if you want me to change anything in there
pinchartl: I'd say up to Thrursday, as I'm intending to handle them on Friday
svarbanov: I suspect that the problem were not with the algorithm itself, but the e-mail you sent sounded truncated
anyway, here, gpg --version says that it supports the following algo:
Pubkey: RSA, RSA-E, RSA-S, ELG-E, DSA
Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH,
        CAMELLIA128, CAMELLIA192, CAMELLIA256
Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2
(I'm using gpg version 1.4.23)
that's the latest gpg packet on fedora 29: gnupg-1.4.23-2.fc29.x86_64
hmm... I have another gpg package here with version 2.2.13
with supports:
Pubkey: RSA, ELG, DSA, ECDH, ECDSA, EDDSA
Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH,
        CAMELLIA128, CAMELLIA192, CAMELLIA256
Hash: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2
mchehab, ok, I'll send another pull request with two more patches and signed with RSA, and also I sent the key to few keyservers
svarbanov: ok, thanks!
svarbanov: not sure if git verify-tag calls gpg or gpg2 (or if it calls a gpg library internally)
mchehab: thank you
mchehab, sent  [GIT PULL for v5.2] Venus updates - take2, hope that it will be fine now
I'll check
mchehab, thanks!
paulk-leonov, Kwiboo: I was looking at the upper layer in ffmpeg hwaccel, I only see code clipping the NAL headers, but was left under the impression that Cedrus needed the slice header to be clipped, maybe you can give me some light ? Did I miss-interpret the bitstream that Cedrus needs &
* ?
ndufresne, mhh, depends on what you mean by clipped :)
I simply mean that the pointer passed would be after the type byte and the slice headers
* that my impression was that it was ^
clipped would mean forcibly copied into a fresh buffer I suppose ?
(or mem moved)
mhh not sure about that, there's a chance we just use the offset provided by vaapi and it just works
is that the offset in bits that is in VAAPI/V4L2 tables ?
ndufresne, yes
h264 has slice->header_bit_size = VASlice->slice_data_bit_offset;
ok, thanks, that is very interesting, https://github.com/intel/libva/blob/master/va/va.h#L3048
it removes a lot of the mistery around the emulation bytes at the same time
things are less clear when it comes to hevc though
I had to do some trickery in our libva backend code
so basically original buffer is  [AVC HEADER or START CODE][NAL UNIT BYTE][SLICE HEADER][SLICE DATA]
FFMPEG passes a pointer to the NAL UNIT BYTE, and set the offset (in bits) to the SLICE DATA
I wonder if it means the slice data is not always bytes aligned
it's not, for certain
and it says that the offset in bits is post removal of the emulation prevention bytes
which imply that the emulation bytes are present
or at least *can be* present
paulk-leonov, I was confused, because a very old discussion was speaking about creating slice headers from the controls, but this has never been needed, since it's passed to the driver, even if not always used
what I'm thinking, is that if we make emulation prevention bytes mandatory (I'll have to bootstrap my H3 really), we could abstract RK and Cedrus in the end
since prepending startcode in RK driver would not really be a performance blocker
ndufresne, mhh, how do you prepend in the driver?
but if emulation prevention bytes (let call these EPB) should not be present in what we pass to Cedrus, it's a different story
without allocating a new buffer and copying?
it's called data_offset
oh right
I keep forgetting about that
for once we have a valid use case
that basically makes the need for an annex-b format go away
note that ffmpeg always copy into v4l2 buffers for encoded stream, no one seems to have noticed
well, rather, the "mixed annex-b + slice header in control" out of my list
yeah sure
yeah, but I really need a way to validate the EPB stuff on cedrus
bbrezillon, ezequielg ^ you should read last 20 lines
hehe
was doing it
thanks
need someone to tell me if I'm dreaming something ;-D
ndufresne, we need to talk about DPB management yeah
I think there's something off with the cedrus h264 implementation about that
right, theoritically, all I'm missing right now would be the reference frame type (IPB) on the DPB
* DPB Entry
oh
ndufresne, btw it would be really good to list precisely what's missing from controls for rk
maybe it's related to ayaka's comment, it was forumlated as a rant, so I ignored
gotta leave very soon (catching a train)
but tl;dr the dpb we pass has evicted entries too, when it shouldn't
I'll make sure to work on this with bbrezillon
if that's really required by the hw, we should do it in the driver
(I think we discussed it already some days ago)
bbl
paulk-leonov, oh, right, I saw this bug
you cannot drop a lost reference like you do now
actually, that one is not the same bug
evicted entries ... maybe a bug in the flushing on I frames ?
* I meant on IDR frames, I frame don't flush the DPB
I wonder if there is a minimally working free validation set for our codecs ...
maybe the ffmpeg tests are sufficient
paulk-leonov, Kwiboo: Anyone tried to run ffmpeg tests against Cedrus ?
tfiga, to save you some reading, I have one question that resume our conversation here
tfiga, do you think it is possible to deduce p/b0/b1 list for RK if we add the reference frame type (I/P/B) in the DPB entry ? if not why ?
bbrezillon, I will take a look (and test) your rk-vpu-mpeg2-v4 branch shortly, will get back to you
ndufresne, the ffmpeg hwaccel should set same ctrl values as libva-v4l2-request, I checked how field values transfered from ffmpeg <-> vaapi <-> libva-v4l2-request in the begining and also added a few ctrl values that was not set by libva-v4l2-request
a great, so I just need to dump in vaapi on my PC if I need some info
thanks @Kwiboo
there is also a possibility we added a new ctrl field to support some feature that was missing in cedrus, not sure jernej pushed that change, could have been for hevc or h264
if you plan to look at ffmpeg code you will se a lot of similarities with the ffmpeg vaapi and vdpau hwaccels, they was used as reference
I have not run any test suite on hwaccel, just testing different samples of working/broken samples and look if they show up correctly on screen
the dpb may differ a little bit from vaapi, there was a lot of testing with different ordering and what entries to include, the order or what pictures that was included did not seem to matter for cedrus
there is/was an issue in cedrus in that if the buffer used as output target was also included in the dbp wrong "position" got used and wrong motion vectors etc could be used
the DPB should be as the spec require imho though
but noted
Kwiboo, do you have specific patches around the dpb in ffmpeg, since from my first read it comes from the same place for all accel
and does this special case only happen if you have broken slice ?
no we use plain ffmpeg (4.0) with only a few patches, mostly for rpi and rkmpp decoder at https://github.com/LibreELEC/LibreELEC.tv/tree/master/packages/multimedia/ffmpeg/patches, nothing should change how dpb or other decoding is handled
https://github.com/LibreELEC/LibreELEC.tv/blob/master/projects/Allwinner/patches/linux/0007-H264-improvements.patch#L37-L40  <-- this is the change that was made in cedrus to be able to handle slice or field encoded videos
and the related code for position
if destination buffer is in dbp, use that buffer as output target instead of a new "position"
ah, so you mean that if the timestamp in the reference is one of the currently being decoded frame, it would not find it
which makes sense, the reference is cleared on qbuf
that was with interlaced content ?
I think it was interlaced content
yes, it was
after we fixed this API bug, you won't have to deq/queue in decode_slice, so that problem should go away
ndufresne: Are you talking about multi-slice frames?
it is not that timestamp ref was clered, it is an issue in cedrus not api
jernej, yes, that would also fix this interlaced case
not sure, as Kwiboo says, it's driver issue, not API
well, you cannot reference the same frame as the one you are decoding to in current v4l2 API
why not, seems to work with cedrus
because the we clear references from the list on qbuf
I mean, otherwise you need to make sure to "assign" the new capture buffer to the current TS before or something, maybe that works, but it seems a bit ...
is that something new or recent (last 3 months) that has changed? because this was working fine when this was tested and fixed in cedrus
I think it's introduced by the use of timestamp
since it's the driver that need to pass somehow the timestamp from output to capture buffers
with index it would not have been an issue
anyway, split slices in interlaced content, or multi-slices are the same
yep, I think that should happen before decoding if anyhing and not after decoding is finished, need to check maybe this is not working as expected with latest v4l2 core
we need to cleanly handle both
in H265, there is only split slicing support iirc
the encoded stream is never "interleaved" if I can abuse that term
btw what vpu hw are you working with on rk? rkvdec? the vpu1/2 h264 has a start_code_e register that seems to control how stream should be parsed
bbrezillon is working on the driver, we target vpu2 for now
I'm just handling the userspace, which when I looked I was sad to see that things were going in split directions
So I'm working (part time) on doing as much reconciliation as I can
(for the context, my past experience with this is limited to Intel VAAPI, I did some maintenance on gstreamer-vaapi, there was issues handling lost field of interlaced content)
so API wise, we are missing the equivalent of reference field in other stack, which tells which field (top/bottom) is being referenced
and we are missing the ability to batch slices, forcing use to dequeue and queue again the capture buffer
didn't you said the other day that multi-slice frames are meant as speed improvement, so batching slices would be detrimental to their purpose
?
if vpu2 is the current target I guess hw reg for start_code_e could be used and same stream can be used as cedrus?, it seems to be set to 1 in mpp library but other hantro g1 code seems to set it when "NAL start prefix in stream start is 0 0 0 or 0 0 1" and some "frame_num workaround" is not used
ndufresne: not sure which one is vpu2, but we need to support both rk3399 and rk3288
Kwiboo: ^
me neither, I just repeated what ezequielg told me to be honnest
rk3288 has vpu1+hevc and rk3399 has vpu2+rkvdec (both can decode h264)
so we need both :)
vpu2 also embeds an h264 decoder?
thought h264 decoding was left to the vdec block
yes both should have h264 decoder but vdec probably have better support for 4k
ok
ndufresne: had a look at the 3 ref_list vs 2 ref_list scheme, and I also have the feeling one format can be turned into the other
ok, so let's try and confirm this
we have asked Maxime to re-add those list, so we should maybe avoid dropping and reading again
sure
you probably notice that those list were per frame, instead of a per slice, that's because the code that existed before didn't support multi-slice, or interleced
ndufresne: have to go
see you o/
feel free to continue the discussion
I'll read the backlog tomorroz
ok
tomorrow
regarding h264 and vpu1+2 I just pushed a commit to https://github.com/Kwiboo/rockchip-vpu-regtool/commit/5fd261a5777c9b8f91c03a2d7ab43e8b969ee943 that adds basic boilerplate for setting the hw regs
sorry my ignorance, what is this tool for ?
generating vpu1/2 hwreg code, I did not want to do manual sync of the mpeg-2 hwreg code
reads txt-files with hwreg = value and generates starting code
so it generate the reg for a specific slice setup ?
turns https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/mpeg2.txt into https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_vpu_hw_mpeg2_dec.c (and rk3288 counterpart)
and that code is the base for the mpeg2 decoder in current patchset
I only ever finished and validated the mpeg2 regs file, the h264/mpeg4/vc1 files is not complete and lots of other code needs to be added in addition to setting hw regs
My strategy was to pass the same file over the mpp vendor stack (with traces that dumps registers
I had a board setup with manual build of mpp
ahh that should also work :), I did a grep of "->sw" on https://github.com/rockchip-linux/mpp/blob/release/mpp/hal/rkdec/h264d/hal_h264d_vdpu2.c to see what regs gets set, then some manual lookup / translate into similar condition using v4l2 ctrl param
bbrezillon, I found one issue with v4, same issue with double free of video device on rmmod, https://github.com/Kwiboo/linux-rockchip/commit/117023f05b3ae72befbb6ec8c6349184cc0dd5f4 seems to fix the issue
reverting to video_device_alloc() instead of devm_kzalloc() is also an option
a small nit for symmetry would be to move rk3288_vdpu_irq() (and related .vdpu_irq = rk3288_vdpu_irq) from rk3288 commit to "Add decoder boilerplate" commit as that is where the rk3399 counterpart is added
same could be done for rk3288/rk3399_vpu_dec_reset, moved from mpeg-2 commits to "Add decoder boilerplate", but they are only referenced once decoder is added
Kwiboo: or move the rk3399 bits to commit "rockchip/vpu: Add MPEG2 decoding support to RK3399"
yep, up to you :-), those functions originate from the chromium vpu driver
Kwiboo: sorry, brain fart, those are generic irq handlers for the decoder, they should definitely be part of the "Add decoder boilerplate" commit
hm, not sure it's really important anyway, as those are only be used when we start adding decoder formats
Kwiboo: I force pushed to my v4 branch
if you're fine with this version I'll send it tomorrow
sure, looks good, thanks :-)