We held a Complex Cameras workshop in Tokyo. The main focus of the workshop was how to support devices whose camera hardware is exposed through Media Controller (MC) based interfaces. Presentations for the topics discussed there are available here.
Linux Complex Cameras Workshop - 2018
Attendees
Sakari Ailus
Jian Xu Zheng
Jerry Hu
Tomasz Figa
Alexandre Courbot
Laurent Pinchart
Mauro Carvalho Chehab
Kieran Bingham
Niklas Söderlund
Nicolas Dufresne
Paul Elder
Jacopo Mondi
Ricky Liang
Daniel Wu
Hans Verkuil
Javier Martinez Canillas
Pavel Machek
Slides and pictures
Group Photo
Introduction
New laptops are arriving without webcam support in the Linux kernel because they use 'complex camera' devices.
1. libv4l review
Mauro presented a short review of the libv4l stack.
- Contains libv4lconvert, a conversion and decompression library, which is however only accessible through the device node abstraction;
- Emulates the old V4L1 API; this part is considered ready for retirement;
- Finally, contains a general emulation of V4L2 (libv4l2), which wraps libv4lconvert; emulated formats are reported with a flag (V4L2_FMT_FLAG_EMULATED);
- Main issue: there are no active maintainers anymore, which causes problems when features are added to the V4L2 API without equivalent emulation (notably CREATE_BUFS).
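For reference, a minimal sketch of a capture application using the libv4l2 wrapper functions (the same entry points that the LD_PRELOAD wrappers redirect open()/ioctl() to). The device path, resolution and pixel format are arbitrary examples:

    /* Minimal capture setup through libv4l2; if the device cannot provide
     * RGB24 natively, libv4lconvert emulation kicks in transparently. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <libv4l2.h>
    #include <linux/videodev2.h>

    int main(void)
    {
        struct v4l2_format fmt;
        int fd = v4l2_open("/dev/video0", O_RDWR);

        if (fd < 0) {
            perror("v4l2_open");
            return 1;
        }

        memset(&fmt, 0, sizeof(fmt));
        fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        fmt.fmt.pix.width = 640;
        fmt.fmt.pix.height = 480;
        fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_RGB24; /* may be emulated */

        if (v4l2_ioctl(fd, VIDIOC_S_FMT, &fmt) < 0)
            perror("VIDIOC_S_FMT");
        else
            printf("Got %ux%u\n", fmt.fmt.pix.width, fmt.fmt.pix.height);

        v4l2_close(fd);
        return 0;
    }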
2. Intel IPU3
- Graphs and pipelines are ISP-specific concepts. They are passed to the firmware as configuration data so that the ISP gets configured to operate as the given pipeline. This isn't exposed through the MC API.
- Side note: the IPU3 has some similarities with the Mali C71 ISP (https://developer.arm.com/products/graphics-and-multimedia/mali-camera). That ISP can work on a pixel, line or frame basis in a memory-to-memory (M2M) manner, but that is all that is public about it for now.
IPU3 parameter documentation
https://www.mail-archive.com/linux-media@vger.kernel.org/msg132827.html
Glossary
- P2P
Public to Private
- CPF Data
tuning data
- IQStudio
tuning tool
Notes
- The Chrome OS camera HAL implementation is different from Android's, but the API is the same
Slicing
- The image may need to be processed by the ISP in several iterations
- The line buffer length is a physical property of an ISP; images wider than that require slicing
- Slicing may be handled in ISP-specific code, but not in application APIs
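A trivial, hypothetical illustration of the idea (the numbers are arbitrary and do not describe any particular ISP):

    /* Split a frame wider than the ISP line buffer into horizontal slices,
     * with a small overlap so filters have context across slice borders. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned int width = 4096;       /* frame width in pixels */
        const unsigned int line_buffer = 2560; /* ISP line buffer length */
        const unsigned int overlap = 64;       /* context re-processed per slice */
        unsigned int x = 0;

        while (x < width) {
            unsigned int end = x + line_buffer;

            if (end > width)
                end = width;
            printf("slice: columns %u-%u\n", x, end - 1);
            if (end == width)
                break;
            x = end - overlap;
        }
        return 0;
    }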
Planning
- The starting point for supporting regular V4L2 programs on complex cameras should be based on LD_PRELOAD, as with libv4l, but not on the current libv4l implementation.
- 3A algorithms generally take ISP statistics as input and produce sensor configuration (exposure time, gain) and ISP parameters as output (a rough sketch of this data flow follows this list);
A pure software implementation (one that uses no ISP statistics and produces no ISP parameters) may be used for development purposes, but for performance reasons such a solution is not usable for video streaming
- Therefore any real-world solution needs to make use of an ISP
The target is to define APIs
- Towards 3A algorithms, which could be proprietary
- Towards applications
- Some hardware has SoC cameras (smart sensors) with a built-in ISP controlled through a high-level interface on the sensor. Such ISPs do not need the kind of software control that ISPs present on SoCs do. The designed APIs should not assume either kind of system (SoC camera or raw sensor + ISP).
- GPUs could be used for processing images as well.
- Optimal image processing pipeline is somewhat hardware specific.
Both fixed-function and programmable hardware exist:
- Even programmable ISPs are much less generic than GPUs today
- No industry-wide APIs for ISPs
A fixed-function pipeline may be assumed:
- This serves the widest set of ISPs
- If a programmable pipeline is needed it can be considered later on when there could be tangible benefit from it
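The 3A bullet above boils down to a simple data-flow contract. The sketch below is purely hypothetical (none of these structures or names exist in a real API) and only illustrates the statistics-in, parameters-out loop:

    /* Hypothetical 3A control loop: ISP statistics in, sensor configuration
     * and ISP parameters out. All names are invented for illustration. */
    #include <stdint.h>

    struct isp_statistics {             /* produced by the ISP for each frame */
        uint32_t ae_histogram[256];     /* luminance histogram for auto-exposure */
        uint32_t awb_means[3];          /* per-channel means for white balance */
        uint32_t af_contrast;           /* focus figure of merit */
    };

    struct sensor_config {              /* applied to the sensor for a later frame */
        uint32_t exposure_lines;
        uint32_t analogue_gain;
    };

    struct isp_params {                 /* applied to the ISP for a later frame */
        uint32_t wb_gains[3];
        uint8_t gamma_lut[64];
    };

    /* One iteration of the control loop; in practice this could live in a
     * proprietary binary behind a stable API. The body is a placeholder. */
    void run_3a(const struct isp_statistics *stats,
                struct sensor_config *sensor, struct isp_params *params)
    {
        (void)stats;
        sensor->exposure_lines = 1000;  /* placeholder values */
        sensor->analogue_gain = 128;
        params->wb_gains[0] = params->wb_gains[1] = params->wb_gains[2] = 256;
        params->gamma_lut[0] = 0;
    }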
3. Requirements
- Support for libv4l-based apps
- Support for closed source binary blobs (LD_PRELOAD)
- Support for closed source 3A algorithms with an external API
- Pipeline settings - configuration file + generic handler + possibility for vendor-specific code
- Provide the same API for both MC-based and devnode-based hardware
- Support for different profiles
- Sandbox support for closed source binary blobs
- Application needs: Frames and control
- Support Android HAL
- Use Android HAL as a source of requirements
- Port the Android HAL and xawtv3 to use it
- Further specify the API to cover parts not specified in Android HAL v3 API:
- include profiles (viewfinder, still capture, etc.).
Android camera HAL v3 pain points
The API is underspecified. It offers multiple ways to achieve the same result (for m streams, queuing n requests with m buffers each, or n*m requests with 1 buffer each). In practice only a subset of those options work for a given HAL implementation. Applications need to comply with those unwritten limitations, and HAL implementations in practice get tested with CTS and a few applications, and thus implement a subset of the options only.
The API doesn't carry stream usage information (viewfinder, video capture, still image capture, ...). That information is translated to requests, and the HAL implementation often uses heuristics to recover the usage information from the requests as the ISP often hardcodes (in hardware or firmware) use cases.
A request requires frames with a specific set of parameters to be captured to specific buffers. This is very difficult to achieve due to real time requirements in pipeline configuration. The HAL has to enqueue buffers at the exact right time, and when the time window to do so is missed, the needed image is captured to the wrong buffer. A memcpy() is then often the only option to recover. It would be much easier if a request instead required capturing the frame to any buffer from a given pool.
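A simplified, hypothetical way to picture the difference between the two request models (these structures are invented for illustration and are not part of the Android HAL):

    /* Hypothetical request models; no real API is being described. */
    #include <stddef.h>

    struct buffer;

    /* Current model: the frame produced by this request must land in
     * exactly this buffer, which forces tight timing (or a memcpy). */
    struct request_fixed {
        const void *settings;   /* capture parameters for this frame */
        struct buffer *target;  /* one specific output buffer */
    };

    /* Suggested model: the frame may land in any free buffer of a pool
     * associated with the stream; the HAL reports which one was used. */
    struct request_pooled {
        const void *settings;
        struct buffer **pool;   /* any free buffer from this pool */
        size_t pool_size;
    };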
Supporting devices with multiple inputs
- Multiple inputs would be exposed as separate devices in the enumeration stage.
If we want to support TV inputs and/or HDMI inputs (recommended IMHO!), then we need the following (a minimal sketch of these ioctls appears at the end of this section):
- setting/querying TV standards (*_STD ioctls)
- setting/querying video timings (*_DV_TIMINGS ioctls)
- hotplugging (events when the video signal goes away or a new stable video signal is detected)
- For testing in upstream one could use:
- Asus Tinkerboard (RK3288)
- i.MX6 (Freescale) boards are also widely available:
- 96 Board ROCK960:
Configuration files used for IPU3 on Chrome OS:
Tuning data used on Chromebook with IPU3, with license:
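For reference, a minimal sketch of the ioctls and hotplug event subscription listed above, for an HDMI-style input (the device path is an arbitrary example and error handling is reduced to perror()):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/videodev2.h>

    int main(void)
    {
        struct v4l2_event_subscription sub;
        struct v4l2_dv_timings timings;
        v4l2_std_id std;
        int fd = open("/dev/video0", O_RDWR);

        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* TV standards (analog inputs) */
        if (ioctl(fd, VIDIOC_QUERYSTD, &std) == 0)
            printf("Detected standard: 0x%llx\n", (unsigned long long)std);

        /* Digital video timings (HDMI-style inputs) */
        if (ioctl(fd, VIDIOC_QUERY_DV_TIMINGS, &timings) == 0)
            printf("Active timings: %ux%u\n",
                   timings.bt.width, timings.bt.height);

        /* Hotplug: get notified when the signal is lost or (re)detected */
        memset(&sub, 0, sizeof(sub));
        sub.type = V4L2_EVENT_SOURCE_CHANGE;
        if (ioctl(fd, VIDIOC_SUBSCRIBE_EVENT, &sub) < 0)
            perror("VIDIOC_SUBSCRIBE_EVENT");

        close(fd);
        return 0;
    }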
4. Laurent's proposal
(Attach the diagram here):
Three types of APIs could be created:
- APIs between App and Camera stack
- APIs between Camera stack and pipeline handler
- APIs between Camera stack and 3A algos
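To make the layering concrete, a purely hypothetical sketch of what the three boundaries could look like; all names are invented for illustration and do not describe an existing implementation:

    /* Hypothetical sketch of the three API boundaries; all names invented. */
    struct camera;
    struct frame_request;
    struct pipeline;
    struct stream_config;
    struct buffer;
    struct sensor_settings;

    /* 1. Application <-> camera stack */
    struct camera *camera_stack_open(const char *id);
    int camera_stack_queue_request(struct camera *cam, struct frame_request *req);

    /* 2. Camera stack <-> pipeline handler (one per ISP/SoC family) */
    struct pipeline_handler_ops {
        int (*configure)(struct pipeline *pipe, const struct stream_config *cfg);
        int (*queue_buffer)(struct pipeline *pipe, struct buffer *buf);
    };

    /* 3. Camera stack <-> 3A algorithms (possibly proprietary, sandboxed) */
    struct algo_3a_ops {
        void (*process_statistics)(void *ctx, const void *stats,
                                   struct sensor_settings *sensor,
                                   void *isp_params);
    };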
5. Intel proposal
3A is more than just 3A: there is a whole set of imaging algorithms that need a userspace control loop (digital video stabilization, tone mapping, ...)
Imaging algorithms are customized for a device through three sets of data:
- The algorithm implementation itself is ISP-specific
- A tuning data file provides information related to a particular sensor model (and VCM model) to fine-tune the algorithms
- Per-unit calibration data (usually stored in NVM) provide further information to fine-tune the algorithms to take instance-specific characteristics into account.
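As a rough, hypothetical illustration of how the three data sets might come together around a single algorithm instance (none of these names come from an actual Intel API):

    /* Hypothetical illustration only; no real API is being described. */
    #include <stddef.h>

    struct tuning_data {        /* per sensor/VCM model, e.g. a CPF file */
        const void *blob;
        size_t size;
    };

    struct calibration_data {   /* per unit, typically read from NVM */
        const void *blob;
        size_t size;
    };

    struct imaging_algorithm {
        const struct tuning_data *tuning;           /* model-specific tuning */
        const struct calibration_data *calibration; /* unit-specific tuning */
        /* The implementation behind this hook is ISP-specific. */
        void (*process)(struct imaging_algorithm *self,
                        const void *statistics, void *isp_params);
    };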
6. Pipewire introduction
- Started as a D-Bus based solution to share video (Pinos); with audio support added it became PipeWire
- Pipeline based - bridge between application and devices
- plugins to bind devices together (nodes) and a daemon to manage them
- library: manages creation and linking of objects; groups them in graphs (1 graph per pipewire instance)
- nodes are shareable between instances (inter-process connections)
- memfd is used for buffer allocation/sharing, avoiding extra copies and allocations
- event based processing (eg. buffer dequeue event on v4l2 source triggers processing in a connected object)
- graph topology is managed by pipewire clients that instantiate and link objects
- format and buffer negotiation happens at link creation time
- lossy: if clients do not consume buffers fast enough, data are overwritten by the source
- buffers are resizeable at runtime, provided the constraints initially specified are respected
- v4l2-ctrls are modeled as configuration parameters between client and source, and applied at each frame capture (no support for multiple clients acting on parameters of the same source)
- explicit fence support: to be implemented; no arbitration issues
- image metadata can be piggy-backed on video buffers and consumed by clients
- only mmap support at the moment; no v4l2 emulation layer
- No DVB support planned - it should work in a similar way to the V4L2 source
- No plans for now for explicit fences
Next steps
- Question:
If someone wants to replace the hardware ISP with a CUDA or OpenCL accelerator, is there anything to keep in mind?
- Good to keep in mind, but in the short term V4L2 drivers for ISPs are more likely (IPU3/4, OMAP, ...)