Designing MAM Systems: Part 3 - Integrating Systems & The Impact Of AI

This series provides a unique reference resource for those specifying, deploying and maintaining Media Asset Management Systems for production and archiving. It describes the types, terminology and technology of Media Asset Management systems. It discusses architecture, metadata, approaches to automation, integration and the pros and cons of bespoke vs. off-the-shelf systems.

Media Asset Management systems are a fundamental element of the broadcast media supply chain. Although systems for news, delivery, production and archiving have many shared core technologies, each has evolved its own nuanced functionality to fit workflow demands. Many vendors offer different applications accordingly. This guide explores the fundamental principles of Media Asset Management system design and implementation with a firm eye on production and archiving... but many of the standards and best practices discussed here apply across all MAM systems.

Media Asset Management systems can become more complex as they evolve. The key to achieving long term success lies in a solid understanding of the underlying technology, design principles and terminology. It is critical to consider the scope of the system carefully and to follow established standards and protocols, which will streamline future expansion.

Managing assets becomes a bigger and more complex task as the scope and scale of ingested content increases. AI provides leverage when wrangling these huge quantities of media assets, but knowing what happens under the hood of an asset management system is key to deploying it successfully.

Integrating Third Party Products With Custom Extensions

Hardware engineers will be familiar with introducing AV glue into their architecture designs. This integrates devices from different manufacturers so they can exchange content and metadata. Whilst both devices may state that they adhere to a common standard, conflicting profiles and levels or sub-optimal implementations can prevent it working. The AV glue rectifies that.

It is the same with software-based solutions in IP workflows. Buying off-the-shelf content management tools will rarely provide everything you need in the longer term. Additional asset managers, external database tables and ancillary software tools will often need to be integrated with the proprietary systems. Software glue does for these systems what AV glue does for hardware.

Past experience of integrating multiple closed systems suggests that the lowest risk approach is to treat a third-party system as read only. Never try to alter the internal structure of a proprietary database. Linking to it via queries, and adding relational joins to supplementary tables that ONLY read its data, does no damage. Sometimes proprietary systems will provide an API to support user data input. Using that API should ensure the internal data structures are undamaged.
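The read-only approach can be sketched with SQLite standing in for the proprietary database. This is a minimal illustration under stated assumptions, not any vendor's actual schema: the `assets` and `rights_notes` tables, the file names and the function names are all hypothetical. Our own supplementary database is opened read-write and the vendor database is attached with `mode=ro`, so joins can read from it but any attempt to write to it fails.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def build_demo(workdir):
    """Create a stand-in vendor database and a separate supplementary one."""
    vendor = Path(workdir) / "vendor.db"
    extra = Path(workdir) / "extra.db"
    with closing(sqlite3.connect(vendor)) as db:
        # Hypothetical vendor table; a real product defines its own schema.
        db.execute("CREATE TABLE assets (asset_id TEXT PRIMARY KEY, title TEXT)")
        db.executemany(
            "INSERT INTO assets VALUES (?, ?)",
            [("A1", "Match highlights"), ("A2", "Studio interview")],
        )
        db.commit()
    with closing(sqlite3.connect(extra)) as db:
        # Our own supplementary table, keyed on the vendor's asset identifier.
        db.execute("CREATE TABLE rights_notes (asset_id TEXT PRIMARY KEY, note TEXT)")
        db.execute("INSERT INTO rights_notes VALUES ('A1', 'UK only')")
        db.commit()
    return vendor, extra


def open_glue_connection(vendor, extra):
    """Open our own database read-write and attach the vendor one read-only."""
    conn = sqlite3.connect(f"{extra.resolve().as_uri()}?mode=rw", uri=True)
    conn.execute(
        "ATTACH DATABASE ? AS vendor",
        (f"{vendor.resolve().as_uri()}?mode=ro",),
    )
    return conn


def assets_with_notes(conn):
    """Join vendor assets to our notes without touching the vendor data."""
    return conn.execute(
        "SELECT a.asset_id, a.title, r.note "
        "FROM vendor.assets AS a "
        "LEFT JOIN rights_notes AS r ON r.asset_id = a.asset_id "
        "ORDER BY a.asset_id"
    ).fetchall()
```

With this arrangement an `UPDATE vendor.assets ...` statement raises `sqlite3.OperationalError`, while inserts into `rights_notes` continue to work. Other database engines offer comparable protection through read-only accounts or roles.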

Managing Multiple Asset Stores

Choosing the optimum asset management software is going to be different for every organization. To start with, a single storage container may not be the best solution. An asset manager that is very good for wrangling video clips may not be optimal for images, for example. Collating sound effects is different to managing music tracks. Managing collections of classical orchestral music requires several additional metadata fields that are not needed for popular music. Classical material is also usually organized by composer rather than the recording artist.

Perhaps you could structure the storage systems around different types of media to start with. Then add a supervising layer on top that allows searches to be federated across all of them. If the search results can then be collected into a basket to be forwarded to the NLE as a project, the system approaches the capabilities that Final Cut Server provided. Some of these media types could be split into multiple separate containers for each sub-category.

The librarian layer sits above all of the different repositories and provides a means to search for related assets and browse the collection as a whole. Where assets are designed to be used together, that grouping is managed in the supervising librarian layer.
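A librarian layer of this kind can be sketched in a few lines. The `Store`, `Hit` and `Librarian` names are hypothetical, and each `Store` here is only an in-memory stand-in; in practice each one would be an adapter that calls the search interface of a real repository.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hit:
    store: str      # which repository the asset lives in
    asset_id: str   # identifier local to that repository
    title: str


class Store:
    """Stand-in for one repository; a real adapter would call that system's API."""

    def __init__(self, name, assets):
        self.name = name
        self._assets = assets  # {asset_id: title}

    def search(self, term):
        term = term.lower()
        return [
            Hit(self.name, asset_id, title)
            for asset_id, title in self._assets.items()
            if term in title.lower()
        ]


@dataclass
class Librarian:
    """Supervising layer: federates searches and keeps a basket of selections."""
    stores: list
    basket: list = field(default_factory=list)

    def search(self, term):
        hits = []
        for store in self.stores:
            hits.extend(store.search(term))
        return hits

    def add_to_basket(self, hit):
        if hit not in self.basket:
            self.basket.append(hit)


video = Store("video", {"V1": "Cup final goal", "V2": "Press conference"})
effects = Store("sfx", {"S1": "Crowd roar after goal"})
librarian = Librarian([video, effects])
for hit in librarian.search("goal"):
    librarian.add_to_basket(hit)
```

The basket holds references (store name plus local identifier) rather than copies of the media, so each repository remains the owner of its own assets.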

This list is not exhaustive and other content types will extend this basic concept.

Asset type | Description
Video clip store | Ingested assets arriving from the field or third parties. This could include finished programs but perhaps they ought to be stored separately in another storage system.
Audio clip store | Audio clips in an ST 2110 environment are stored separately to the video. They could be gathered separately or embedded into the video at ingest time. More likely, production for podcasts and radio broadcasts could keep their audio clips here. A separate store for music tracks could be provided.
Sound effects store | Sound effects don't easily file with collections of music. The properties they need are different and their groupings are thematic rather than the way music albums and playlists are constructed.
Image collection | Photographic and pixel-based images are collected here. Again this could be split to store photographs separately from photoshopped images. Iconic images might be split off to another store.
Vector image store | Vector images are useful for storing line drawings in a compact way. These will be most useful for online content delivery. User interfaces are increasingly using vector graphics and map data lends itself to this format.
3D model store | 3D models are useful for creating illustrations for news and sport content but they are sufficiently different to illustrations to justify collecting them in a separate container.
Text fragment store | Arbitrary fragments of text can be stored here. If the text objects have a time property and a program UUID associated with them, this could be a way to store sub-title or strap-line texts although there may be better solutions using VTT files.
Text document store | Larger bodies of text are stored separately from text fragments.
Script store | Scripts containing the spoken dialog can carry useful metadata and could be synchronized with speech recognition to provide more robust subtitling.

Import-export Issues

Moving metadata from one repository to another should always preserve all of the data so the transaction can be reversed to send the media back to the source system without losses.

This requires that the target metadata repository has a corresponding field for every property in the source metadata repository. The target repository could support extra fields as well, but those additional fields cannot be transported back to the source.
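A pre-flight check along these lines can be run before any transfer. It is only a sketch: the schemas are represented as plain dictionaries mapping field names to type names, and the field and type names shown are hypothetical examples rather than those of any particular product.

```python
def missing_fields(source_schema, target_schema):
    """Return source fields the target cannot hold, or holds with a different type."""
    problems = {}
    for name, source_type in source_schema.items():
        target_type = target_schema.get(name)
        if target_type is None:
            problems[name] = "no corresponding field"
        elif target_type != source_type:
            problems[name] = f"type differs: {source_type} -> {target_type}"
    return problems


source = {"title": "text", "duration_s": "float", "shot_year": "int4"}
target = {"title": "text", "duration_s": "int", "notes": "text"}
report = missing_fields(source, target)
```

Here `report` flags `duration_s` (type mismatch) and `shot_year` (no corresponding field). The extra `notes` field in the target is not reported because it does not prevent a lossless transfer in the forward direction.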

Any loss of data during a transport process is detrimental and could be very subtle. Here are some examples:

Source data | Target system
Floating point value. | Integer container loses the value following the decimal point.
Four-digit year. | Two-digit database field. The archetypal Y2K problem.
Variable length text. | Shorter fixed length text field crops off additional characters.
10-bit video. | 8-bit video storage loses quality via a transfer function.
HDR images. | Standard dynamic range image containers lose HDR information.
Lossless audio. | Compressed audio such as MP3 or AAC are always lossy.
Unicode text. | ISO 8859 text store loses some character values which are replaced by a missing character glyph. Can be cured with escape sequences.
Unicode text. | ASCII text store loses even more character values than ISO 8859 without marking them with a missing character glyph because there isn't one. Can also be cured with escape sequences.
Tab separated data. | Comma separated storage potentially introduces extra field delimiters due to embedded unescaped comma characters in the body text. Process to escape the commas on arrival.
Macintosh files with resource forks (historical). | Detachment of resource forks in non-MacOS file sharing servers.
Files containing 4GBytes of data. | Truncated to 2GBytes in some file systems.
8-bit text. | 7-bit text storage loses the most significant bit which alters the characters that are represented by the code points 128 to 255. Use an escape sequence to represent 8-bit characters.
Dates prior to January 1st 1901. | Not represented by some base timestamp models. Must be expressed as text. Note this is not the same as the UNIX 32-bit timestamp limitation.
Dates after January 19th 2038. | Cannot be represented in the UNIX signed 32-bit timestamp format. Must be expressed as text. Like the Y2K problem but much worse consequences because the 32-bit signed dates wrap around to December 1901. Solved by using 64-bit values.
Dates prior to December 13th 1901. | Cannot be represented by UNIX 32-bit timestamps. Solved by using 64-bit values.
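A few of these losses are easy to demonstrate, and the same round-trip test can be used to detect them before a migration. The sketch below is illustrative only and the function names are hypothetical. It uses Python's built-in `unicode_escape` codec as one possible escape sequence scheme and the standard `csv` module to quote embedded commas; a real system may well use different conventions.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def survives_round_trip(value, to_target, to_source):
    """True if converting to the target form and back returns the same value."""
    return to_source(to_target(value)) == value


def wrap_int32(seconds):
    """Mimic storing a timestamp in a signed 32-bit field."""
    return (seconds + 2**31) % 2**32 - 2**31


def to_ascii(text):
    """Escape every non-ASCII character so that a 7-bit store can hold the text."""
    return text.encode("unicode_escape").decode("ascii")


def from_ascii(escaped):
    """Reverse to_ascii()."""
    return escaped.encode("ascii").decode("unicode_escape")


def tsv_to_csv(tsv_text):
    """Re-delimit tab separated rows; csv.writer quotes fields containing commas."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        writer.writerow(row)
    return out.getvalue()


# A float squeezed into an integer container does not come back intact.
float_ok = survives_round_trip(25.5, int, float)
# Escaped text does survive a trip through an ASCII-only store.
text_ok = survives_round_trip("Zoë, café", to_ascii, from_ascii)
# One second past the 32-bit limit wraps around to December 1901.
wrapped = EPOCH + timedelta(seconds=wrap_int32(2**31))
# The embedded comma is protected by quoting rather than becoming a delimiter.
csv_line = tsv_to_csv("Goal, second half\t00:47:10\n")
```

Running `survives_round_trip` over a sample of real records for each field is a cheap way to find the subtle cases before they reach production.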

The Impact Of AI On Asset Management

Media Asset Management is an area where AI has a lot to offer. The industry is already being transformed by the use of Machine Learning. AI can certainly offer help when ingesting content and provides more assistance when searching.

We will be getting into the subject of AI more deeply in another series. Perceptive AI covers a whole spectrum of activity from simple deductions to advanced pattern recognition. For now, let’s examine some simple requirements for Perceptive AI tools that extract metadata or enhance the results of search engines.

AI does provide some useful benefits when extracting information logged by a camera for storage in the metadata repository. However, is this really a sophisticated AI process or just simple transformations of embedded metadata? It is beneficial either way and delivers information that location crews rarely input properly or completely.

We might expect professional digital cameras to at least log the information that a mobile phone records and a lot more besides:

  • Date of shot.
  • Time of shot.
  • GPS location of shot.
  • Directional orientation of camera (if the camera is smart enough).
  • Elevation angle (tilt of camera if it is smart enough).
  • Altitude.
  • Camera mode.
  • Exposure settings.
  • Shutter details.
  • Lens characteristics.
  • Focus.
  • Resolution.
  • Frame-rate.
  • Color model.
  • Sensor type.
  • Imaging sensor response curve.

Some of these values might need to be logged on a frame-by-frame basis when shooting in handheld mode or with drones and aerial photography.
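One way to model this split between clip-level and frame-level values is shown below. It is a sketch of a possible data structure, not a standard schema: the class names, field names and units are assumptions chosen for illustration, and only a subset of the properties listed above is included.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FrameSample:
    """Values that can change during a shot (handheld or drone work)."""
    frame: int
    latitude: float
    longitude: float
    altitude_m: float
    heading_deg: Optional[float] = None   # compass direction, if the camera logs it
    tilt_deg: Optional[float] = None      # camera tilt, if the camera logs it


@dataclass
class ClipMetadata:
    """Values that normally stay fixed for the whole clip."""
    shot_at: str            # date and time of the shot as an ISO 8601 string
    frame_rate: float
    resolution: str
    color_model: str
    samples: list = field(default_factory=list)

    def sample_at(self, frame):
        """Return the most recent sample logged at or before the given frame."""
        earlier = [s for s in self.samples if s.frame <= frame]
        return max(earlier, key=lambda s: s.frame) if earlier else None


clip = ClipMetadata("2024-05-01T14:30:00Z", 25.0, "3840x2160", "YCbCr")
clip.samples.append(FrameSample(0, 51.5, -0.12, 35.0, heading_deg=90.0))
clip.samples.append(FrameSample(50, 51.5001, -0.1201, 60.0, heading_deg=95.0))
```

Logging a sample only when a value changes, then looking up the latest one with `sample_at`, keeps the repository compact for locked-off shots while still supporting moving cameras.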

Other information that AI techniques can help extract concerns the cast and crew participating in the shoot. If AI has access to all of the documentation related to the pre-shoot planning, much of that might be inferred.

Each individual field in the metadata database will need a separate AI prompt to extract the specific information required, similar perhaps to a database query in a conventional system. This approach would also work with a Small Language Model (SLM) based on a contextually focused set of training data.
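The one-prompt-per-field idea might look something like this. Everything here is hypothetical: the field names, the prompt wording and the `ask_model` callable, which stands in for whichever language model is actually deployed and is assumed to take a prompt plus the document text and return a string.

```python
FIELD_PROMPTS = {
    "director": "Who is named as the director? Answer with the name only.",
    "location": "Where does the shoot take place? Answer with the place only.",
    "shoot_date": "On what date is the shoot scheduled? Answer as YYYY-MM-DD.",
}


def extract_fields(document_text, ask_model, prompts=FIELD_PROMPTS):
    """Ask one question per metadata field and collect the answers.

    ask_model is any callable taking (prompt, document_text) and returning a
    string. It is a placeholder, not the API of a specific product.
    """
    record = {}
    for field_name, prompt in prompts.items():
        answer = ask_model(prompt, document_text).strip()
        record[field_name] = answer or None   # empty answers become missing values
    return record
```

Keeping the prompts in a table beside the field definitions means a new metadata field needs only one new entry, much as a new column needs only one new query in a conventional system.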

AI is certainly useful when logging points of interest in sports footage. For example, we have known for a long time that the compressed video stream bit-rate bursts and the audio amplitude increases significantly when a goal happens in a football match. AI should be able to detect those events in video without any difficulty. They could be made available to the presenting commentators and pundits for near instantaneous action replays.
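Even without machine learning, the coincidence of a bit-rate burst and an audio peak can be picked out with simple statistics. The sketch below is a deliberately naive baseline, assuming one bit-rate and one audio level measurement per second; the sample values and the threshold are made up for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def candidate_events(bitrate_kbps, audio_rms, threshold=1.5):
    """Return indexes (one per second here) where both signals spike together."""
    video_z = zscores(bitrate_kbps)
    audio_z = zscores(audio_rms)
    return [
        i for i, (v, a) in enumerate(zip(video_z, audio_z))
        if v > threshold and a > threshold
    ]


# Invented measurements: seconds 4 and 5 show a burst in both signals.
bitrate = [4000, 4100, 3900, 4050, 9500, 9800, 4000, 4100, 3950, 4000]
audio = [0.20, 0.22, 0.19, 0.21, 0.80, 0.85, 0.60, 0.21, 0.20, 0.22]
events = candidate_events(bitrate, audio)   # [4, 5]
```

Requiring both signals to spike reduces false alarms from a scene cut (video only) or a stadium announcement (audio only). A trained model would be expected to do better, but a baseline like this gives something to compare it against.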

Taking Perceptive AI Results At Face Value Can Be Risky

Machine learning can deduce things about the content but there are some interesting issues with the analysis that need to be taken into account.

At a demonstration of the IBM AI work in progress, I saw an image recognizer deduce the correct name of a celebrity with a confidence level of about 90%. It also simultaneously deduced the gender of that celebrity with only a 65% confidence. Given that the celebrity's gender was already well-known, this was odd.

The two ML pattern recognition results were arrived at completely independently. Introduce an additional layer of logic to rank the results according to the confidence level. Then if possible, override the lower confidence results by inferring them from higher ranking ones.
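That additional layer of logic could be sketched as follows. The `Finding` structure, the `known_facts` lookup (imagined here as a trusted biography record keyed on a person's name) and the 0.8 cut-off are all assumptions made for the example, and the person's name is invented.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    attribute: str      # e.g. "name" or "gender"
    value: str
    confidence: float   # 0.0 to 1.0, as reported by the recognizer
    inferred: bool = False


def reconcile(findings, known_facts, min_confidence=0.8):
    """Override weak findings with facts implied by stronger ones.

    known_facts maps (attribute, value) to a dict of attributes that follow
    from it. Findings below min_confidence are never used as a basis for
    inference.
    """
    ranked = sorted(findings, key=lambda f: f.confidence, reverse=True)
    result = {}
    for finding in ranked:
        if finding.attribute not in result:
            result[finding.attribute] = finding
        if finding.confidence < min_confidence:
            continue   # too weak to infer anything else from
        implied = known_facts.get((finding.attribute, finding.value), {})
        for attribute, value in implied.items():
            current = result.get(attribute)
            if current is None or current.confidence < finding.confidence:
                result[attribute] = Finding(
                    attribute, value, finding.confidence, inferred=True
                )
    return result


findings = [
    Finding("gender", "male", 0.65),
    Finding("name", "Jane Example", 0.90),
]
facts = {("name", "Jane Example"): {"gender": "female"}}
resolved = reconcile(findings, facts)
```

In this run the 65% gender result is replaced by the value implied by the 90% name match, and the `inferred` flag records that it came from a lookup rather than from the recognizer, which is useful for later human moderation.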

I don’t think this is yet an exact science. Perhaps that ranking can be automated, but some capacity-limited human moderation might be necessary.

This is all analogous to the quality of speech-to-text recognition systems, which are not always completely accurate. The fundamental question is, “What level of inaccuracy is tolerable?” because that translates directly into the quality of our metadata when AI is generating it for us. For the time being, the output could be checked by human operators to measure the accuracy.
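When human operators check only a sample of the AI output, the measured accuracy is itself an estimate. One common way to express that uncertainty is the Wilson score interval, sketched below. The choice of this particular interval, the 95% level (z = 1.96) and the function names are my assumptions rather than an industry rule.

```python
from math import sqrt


def accuracy_interval(correct, checked, z=1.96):
    """Wilson score interval for the true accuracy, given a human-checked sample."""
    if checked == 0:
        raise ValueError("no items were checked")
    p = correct / checked
    denom = 1 + z**2 / checked
    centre = (p + z**2 / (2 * checked)) / denom
    half = z * sqrt(p * (1 - p) / checked + z**2 / (4 * checked**2)) / denom
    return centre - half, centre + half


def acceptable(correct, checked, required_accuracy):
    """Accept only if even the pessimistic end of the interval meets the target."""
    low, _ = accuracy_interval(correct, checked)
    return low >= required_accuracy


low, high = accuracy_interval(188, 200)   # 94% of a 200-item sample were correct
```

With 188 correct out of 200 checked, the interval runs from roughly 0.90 to 0.97, so a 94% sample result does not yet prove that a 90% target is being met. Checking more items narrows the interval.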

Relevant Standards & Other Resources

ISO has codified asset management principles and published them as the ISO 55000 series. This breaks down as:

  • Physical asset management.
  • Infrastructure asset management.
  • Fixed assets management.
  • IT asset management.
  • Digital asset management.

The ISO 55000 series is promoted by the Institute of Asset Management. Access their website here:

https://theiam.org/

It currently comprises three ISO standards (55000, 55001 and 55002).

The ISO 55000 standard indicates that there are asset management solutions for the entire infrastructure and enterprise. They may not be directly connected to our media records but the ISO 55000 IT assets will inform how workflow processes operate. Likewise any description of storage systems and hardware assets can be referenced as location accessors where media essence assets reside.

These documents are relevant sources of information and important specifications when designing your content management system:

Standard | Edition | Description
RFC 822 | 1982 | Standard for the format of ARPA internet text messages. There are later RFCs that supersede RFC 822.
RFC 1327 | 1992 | Mapping between X.400(1988)/ISO 10021 and RFC 822.
RFC 1521 | 1993 | Multipurpose Internet Mail Extensions (MIME). Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies.
RFC 1541 | 1993 | Dynamic Host Configuration Protocol (DHCP).
RFC 1766 | 1995 | Tags for the Identification of Languages.
RFC 2413 | 1998 | Dublin Core Metadata for Resource Discovery.
ISO 639 | 2023 | Code for individual languages and language groups.
ISO 3166 | 2020 | The International Standard for country codes and codes for their subdivisions.
ISO 3166-1 | 2020 | Part 1 defines codes for the names of countries, dependent territories, and special areas of geographical interest.
ISO 3166-1 alpha 2 | 2020 | This section describes two letter country codes.
ISO 3166-1 alpha 3 | 2020 | This section describes three letter country codes.
ISO 3166-1 numeric | 2020 | This section describes three-digit numeric country codes as defined by the United Nations.
ISO 3166-2 | 2020 | Part 2 is the country subdivision code. This defines codes for the names of the principal subdivisions (e.g., provinces, states, departments, regions).
ISO 3166-3 | 2020 | Part 3 lists the codes for formerly used names of countries. It defines codes for country names which have been deleted from ISO 3166-1 since its first publication in 1974.
ISO 8601 | 2019 | Date and time formats.
ISO 8601-1 | 2019 | Part 1: Basic rules.
ISO 8601-2 | 2019 | Part 2: Extensions.
ISO 8859 | Various | A 16-part standard describing 8-bit character code tables. Based on ASCII, each part addresses a different culture or national language with special glyphs incorporated for each.
ISO 8859-1 | 1998 | Latin-1 Western European character set.
ISO 10021 | 2003 | Message Handling Systems (MHS).
ISO 10021-1 | 2003 | MHS System and service overview.
ISO 10021-2 | 2003 | MHS Overall architecture.
ISO 10021-3 | Withdrawn | Abstract Service Definition Conventions.
ISO 10021-4 | 2003 | MHS Message transfer system - Abstract service definition and procedures.
ISO 10021-5 | 1999 | MHS Message store: Abstract service definition.
ISO 10021-6 | 2003 | MHS Protocol specifications.
ISO 10021-7 | 2003 | MHS Interpersonal messaging system.
ISO 10021-8 | 1999 | Electronic Data Interchange Messaging Service.
ISO 10021-9 | 1999 | MHS - Electronic Data Interchange Messaging System.
ISO 10021-10 | 1999 | MHS routing.
ISO 10021-11 | 1999 | MHS Routing - Guide for messaging systems managers.
ISO 10646 | 2025 | An ISO published version of the Unicode Universal Character Set. Not usually as up to date as the Unicode specification. Amendments (2025) are published to carry the latest changes published by Unicode. Currently under review.
ISO 15836 | 2019 | Information and documentation - The Dublin Core metadata element set.
ISO 15836-1 | 2017 | Dublin Core Part 1: Core elements.
ISO 15836-2 | 2019 | Dublin Core Part 2: Properties and classes.
ISO 18016 | 2003 | Information technology - Message Handling Systems (MHS): Interworking with Internet e-mail.
ISO 19115 | 2014 | A metadata schema for describing geographic information and services.
ISO 55000 | 2024 | Asset management - Vocabulary, overview and principles.
ISO 55001 | 2024 | Asset management system - Requirements.
ISO 55002 | 2018 | Asset management - Guidelines for the application of ISO 55001.
ISO/AWI TS 55014 | Draft | Guidance for asset management decision making.
BSI PAS 55 | 2008 | The original Publicly Available Specification (PAS) foundation that the ISO 55000 series was built on.
Unicode | 16.0.0 | The Universal Character Set. This is updated frequently (usually every year) to introduce new character glyphs and Emoji symbols. Refer to the Unicode Consortium web site (https://www.unicode.org/) for more information.

There are not very many helpful books covering Asset Management design. This one is well respected and written by an experienced media librarian who has worked for major UK broadcasters.

A Handbook for Media Librarians by Katharine Schopflin.

ISBN: 9781856046305

Published in 2008 by Facet Publishing

Conclusion

There has always been some tension between the counter arguments for building your own system vs. buying an off-the-shelf solution.

A bespoke solution may be more secure since it is a one-off design, and any potential security intrusion exploits will not be in the public domain. The internal workings and attack surface of a commercial product may have well known intrusion exploits. All systems demand ongoing security maintenance; with a bespoke system that falls to you, whereas an off-the-shelf system should benefit from vendor maintenance.

Bespoke systems can be very closely tailored to your specific needs as opposed to compromising your workflow by forcing it to conform to the functionality of a proprietary design.

There is always going to be some ‘glue’ work to be done when bringing up an asset management system using third-party systems. To start with they can be operated independently. At some point adding a layer of bespoke code to control and aggregate the search and extraction process is beneficial.

Careful configuration is also important and it may be possible to write plug-ins that assist with integration. Output bins might be configured to point at watch folders used by other applications. Conversion tools can be interposed to resolve file formatting conflicts using the same technique.
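The watch folder technique can be sketched as a small relay that runs on a schedule. This is an illustration under assumptions: the folder roles, the `.mxf` suffix, the `.part` convention and the `convert` callable (a stand-in for a real conversion tool) are all hypothetical, and it assumes the receiving application ignores files ending in `.part`.

```python
import shutil
from pathlib import Path


def relay_once(output_bin, watch_folder, convert=None, suffix=".mxf"):
    """Move finished files from one system's output bin to another's watch folder.

    convert is an optional callable (source_path, destination_path) that
    writes a converted copy. Without it the file is copied unchanged.
    """
    output_bin, watch_folder = Path(output_bin), Path(watch_folder)
    delivered = []
    for source in sorted(output_bin.glob(f"*{suffix}")):
        destination = watch_folder / source.name
        partial = destination.with_name(destination.name + ".part")
        if convert:
            convert(source, partial)
        else:
            shutil.copy2(source, partial)
        # Rename within the same folder so the final name appears only
        # once the file is complete.
        partial.replace(destination)
        source.unlink()
        delivered.append(destination.name)
    return delivered
```

A scheduler or a simple loop would call `relay_once` every few seconds. Writing to a temporary name and renaming at the end is a common way to stop the downstream application from picking up a half-written file.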

If you use AI to any great extent, keep the human in the loop until you have verified that the AI is working correctly. Perceptive AI may misinterpret the content occasionally but this is probably not as serious as the hallucinations that Generative AI suffers from. Keeping track of how well it is doing through human quality checking and scoring of the results will, with a little statistical analysis, lead you to the necessary improvements. If you find some problematic imagery, feed it back to the AI engineers to analyze so they can improve the recognition performance to deal with edge cases.
