Data Model documentation options

Mikes Notes

Pipi has hundreds of autonomous databases, some with up to 60 tables. The existing Pipi entity engine stores every model, entity, relationship, and column Pipi uses. Developers need a web-based data model explorer to understand the databases they are using. The problem is how to design the user interface. This note examines how other modelling tools present information to develop something familiar and useful for Pipi.

The experimental toy prototypes are here. (The web URLs will likely change in the future.)

Requirements

A data model explorer needs to auto-generate documentation.

These objects are a minimum.
  • Database
  • Schema
  • E/R Diagram
  • Data Type
  • Table
  • Column
  • Domain
  • Relationship
  • Primary Keys (PK)
  • Foreign Keys (FK)
  • Indexes
It should also be able to provide these model views
  • Conceptual
  • Logical
  • Physical
Other
  • Data Dictionary including i18n
Linked data
The model objects need to automatically link to other documentation
  • DAO (Data Access Objects)
  • Ontologies
  • Use in workflows
  • Variables
  • "JavaDocs" style code packages
  • i18n object names
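
One way to picture the minimum object set is as a small metadata model from which documentation pages are generated. The sketch below is illustrative only: the class names (`Column`, `ForeignKey`, `Table`), the `render_table_doc` helper, and the sample `invoice` table are assumptions made for this example, not part of the Pipi entity engine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    name: str
    data_type: str
    is_primary_key: bool = False


@dataclass
class ForeignKey:
    column: str      # column in the owning table
    ref_table: str   # referenced table
    ref_column: str  # referenced column


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)


def render_table_doc(table: Table) -> str:
    """Return a plain-text documentation page for one table."""
    lines = [f"Table: {table.name}", "Columns:"]
    for col in table.columns:
        pk = " (PK)" if col.is_primary_key else ""
        lines.append(f"  - {col.name}: {col.data_type}{pk}")
    if table.foreign_keys:
        lines.append("Foreign keys:")
        for fk in table.foreign_keys:
            lines.append(f"  - {fk.column} -> {fk.ref_table}.{fk.ref_column}")
    return "\n".join(lines)


# Hypothetical sample table used only to exercise the sketch.
invoice = Table(
    name="invoice",
    columns=[
        Column("invoice_id", "INTEGER", is_primary_key=True),
        Column("customer_id", "INTEGER"),
        Column("total", "DECIMAL(10,2)"),
    ],
    foreign_keys=[ForeignKey("customer_id", "customer", "customer_id")],
)

print(render_table_doc(invoice))
```

Because the page is rendered from metadata rather than written by hand, the same structure could in principle feed the conceptual, logical and physical views listed above.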

Examples

Database Workbench

DBeaver


DbSchema



Enterprise Architect


ER/Studio Data Architect

erwin Data Modeler


MS SQL Server Management Studio

Metabase


ModelRight

Navicat

Software Ideas Modeler


SQuirreL

SQLyog


Vertabelo Data Modeller




Fundamentals of Data Visualization

"As objective as data might be, there’s a human factor that is easily overseen when it comes to creating visualizations that accurately reflect it: bias and misunderstandings. Having worked with students and postdocs on thousands of data visualizations over the years, Claus O. Wilke, Professor of Integrative Biology, knows from experience that the same issues arise over and over when it comes to visualizing data."

"In his book Fundamentals of Data Visualization, he collected his accumulated knowledge from these interactions to help everyone create clear, attractive, and convincing data visualizations. You can read the complete manuscript for free on the author’s website." - Smashing Magazine.



Financial Times Visual Vocabulary

Mikes Notes

Smashing Magazine wrote about the Financial Times Visual Vocabulary, available on GitHub as a chart and a website in multiple languages.

"Violins, doughnuts, pies, slopes — data can be visualized in many ways. But which type of chart should you pick? To help you select the optimal visualization type for your data, the Financial Times Visual Journalism Team published the Financial Times Visual Vocabulary." - Smashing Magazine.

The material below is copied from the vocabulary.

Financial Times Visual Vocabulary

A poster (available in English, Japanese, traditional Chinese and simplified Chinese) and web site to assist designers and journalists to select the optimal symbology for data visualisations, by the Financial Times Visual Journalism Team.

The FT Visual Vocabulary is at the core of a newsroom-wide training session aimed at improving chart literacy. This learning resource is inspired by the Graphic Continuum by Jon Schwabish and Severino Ribecca. This is not an attempt to teach everyone how to make charts, but how to recognise the opportunities to use them effectively alongside words.

Read the Chart Doctor feature column for full background on why we made this: Simple techniques for bridging the graphics language gap

For D3 templates for producing many of these chart types in FT style, see our Visual Vocabulary repo.

Related reading

The full content of the poster, along with links to related material, including research and examples of best practice. This is a work in progress.

General

  • National Geographic: Taking data visualisation from eye candy to efficiency
  • William S. Cleveland and Robert McGill: Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods
  • Hadley Wickham: A Layered Grammar of Graphics
  • Tracey L. Weissgerber et al: Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm
  • Numeroteca: Uses and abuses of data visualisations in mass media
  • Andy Cotgreave: The inevitability of data visualization criticism
  • Alberto Cairo: "Our reader" won't understand something as complicated as that!
  • Alberto Cairo: Visualization's expanding vocabulary

Deviation

Emphasise variations (+/-) from a fixed reference point. Typically the reference point is zero but it can also be a target or a long-term average. Can also be used to show sentiment (positive/neutral/negative). Example FT uses: Trade surplus/deficit, climate change
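
As a small worked illustration of the idea, the sketch below computes the signed deviations that a diverging chart would plot. The function name `deviations` and the sample temperatures are assumptions made for the example; when no reference is given, the mean of the values stands in for a long-term average.

```python
def deviations(values, reference=None):
    """Return each value's signed distance from a reference point.

    With no reference given, the mean of the values is used as a
    stand-in for a long-term average.
    """
    if reference is None:
        reference = sum(values) / len(values)
    return [v - reference for v in values]


temperatures = [14, 15, 16, 19]  # hypothetical yearly averages

# Deviation from a fixed target of 15.
print(deviations(temperatures, reference=15))

# Deviation from the average of the series itself (16.0).
print(deviations(temperatures))
```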

Diverging bar

A simple standard bar chart that can handle both negative and positive magnitude values.

  • Chart Doctor: How the FT explained Brexit

Diverging stacked bar

Perfect for presenting survey results which involve sentiment (eg disagree/neutral/agree).

Spine chart

Splits a single value into 2 contrasting components (eg Male/Female)

Surplus/deficit filled line

The shaded area of these charts allows a balance to be shown – either against a baseline or between two series.

Correlation

Show the relationship between two or more variables. Be mindful that, unless you tell them otherwise, many readers will assume the relationships you show them to be causal (i.e. one causes the other). Example FT uses: Inflation & unemployment, income & life expectancy

  • Chart Doctor: The German election and the trouble with correlation

Scatterplot

The standard way to show the relationship between two continuous variables, each of which has its own axis.

  • Chart Doctor: The storytelling genius of unveiling truths through charts
  • Maarten Lambrechts: 7 reasons you should use dot graphs
  • Tim Brock: Too Big Data: Coping with Overplotting
  • Sara Kehaulani Goo: The art and science of the scatterplot
  • Examples: FT

Line + Column

A good way of showing the relationship between an amount (columns) and a rate (line)

  • Data Revelations: Be Careful with Dual Axis Charts
  • DataHero: The Do’s and Don’ts of Dual Axis Charts
  • Harvard Business Review: Beware Spurious Correlations

Connected scatterplot

Usually used to show how the relationship between two variables has changed over time.

  • Robert Kosara: The Connected Scatterplot for Presenting Paired Time Series
  • Data Revelations: Be Careful with Dual Axis Charts
  • Examples: Washington Post

Bubble

Like a scatterplot, but adds additional detail by sizing the circles according to a third variable

  • Chart Doctor: The storytelling genius of unveiling truths through charts
  • Examples: FT

XY heatmap

A good way of showing the patterns between 2 categories of data, less good at showing fine differences in amounts.

  • Chart Doctor: Use fewer maps to illustrate data better

Ranking

Use where an item’s position in an ordered list is more important than its absolute or relative value. Don’t be afraid to highlight the points of interest. Example FT uses: Wealth, deprivation, league tables, constituency election results

Ordered bar

Standard bar charts display the ranks of values much more easily when sorted into order

Ordered column

See above.

Ordered proportional symbol

Use when there are big variations between values and/or seeing fine differences between data is not so important.

Dot strip plot

Dots placed in order on a strip are a space-efficient method of laying out ranks across multiple categories.

Slope

Perfect for showing how ranks have changed over time or vary between categories.

Lollipop chart

Lollipops draw more attention to the data value than standard bar/column and can also show rank and value effectively.

Distribution

Show values in a dataset and how often they occur. The shape (or ‘skew’) of a distribution can be a memorable way of highlighting the lack of uniformity or equality in the data. Example FT uses: Income distribution, population (age/sex) distribution

  • Joey Cherdarchuk: Visualising distributions

Histogram

The standard way to show a statistical distribution - keep the gaps between columns small to highlight the ‘shape’ of the data

  • Aran Lunzer and Amelia McNamara: Exploring histograms

Boxplot

Summarise multiple distributions by showing the median (centre) and range of the data

Violin plot

Similar to a box plot but more effective with complex distributions (data that cannot be summarised with a simple average).

Population pyramid

A standard way for showing the age and sex breakdown of a population distribution; effectively, back to back histograms.

Dot strip plot

Good for showing individual values in a distribution, can be a problem when too many dots have the same value.

Dot plot

A simple way of showing the change or range (min/max) of data across multiple categories.

Barcode plot

Like dot strip plots, good for displaying all the data in a table, they work best when highlighting individual values.

  • Maarten Lambrechts: Interactive strip plots for visualizing demographics

Cumulative curve

A good way of showing how unequal a distribution is: y axis is always cumulative frequency, x axis is always a measure.
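
As a rough illustration of the data behind a cumulative curve, the sketch below sorts a set of values and pairs each distinct value (the measure, on the x axis) with the share of observations at or below it (the cumulative frequency, on the y axis). The function name `cumulative_curve` and the sample incomes are assumptions made for the example.

```python
def cumulative_curve(values):
    """Return (measure, cumulative_frequency) points for a cumulative curve.

    x is the measure itself; y is the fraction of observations <= x.
    """
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for i, value in enumerate(ordered, start=1):
        # Collapse ties so each distinct x keeps only its highest y.
        if points and points[-1][0] == value:
            points[-1] = (value, i / n)
        else:
            points.append((value, i / n))
    return points


incomes = [20, 20, 30, 50, 200]  # hypothetical sample data
for x, y in cumulative_curve(incomes):
    print(x, y)
```

In this sample the curve reaches 0.8 by a value of 50 and only reaches 1.0 at 200, which is the kind of inequality the chart is meant to expose.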

Change over Time

Give emphasis to changing trends. These can be short (intra-day) movements or extended series traversing decades or centuries. Choosing the correct time period is important to provide suitable context for the reader. Example FT uses: Share price movements, economic time series

  • Flowing Data: 11 Ways to Visualize Changes Over Time – A Guide

Line

The standard way to show a changing time series. If data are irregular, consider markers to represent data points

  • Chart Doctor: A chart’s ability to mislead is off the scale
  • Office for National Statistics: Does the axis have to start at zero? (Part 1 – line charts)
  • Quartz: It's OK not to start your y-axis at zero
  • Vox: Shut up about the y-axis. It shouldn't always start at zero
  • Emily Schuch: How to Make a Line Chart that Doesn't Lie

Column

Columns work well for showing change over time - but usually best with only one series of data at a time.

  • Chart Doctor: A chart’s ability to mislead is off the scale
  • Office for National Statistics: Does the axis have to start at zero? (Part 2 – bar charts)

Line + column

A good way of showing the relationship over time between an amount (columns) and a rate (line)

Stock price

Usually focused on day-to-day activity, these charts show opening/closing and hi/low points of each day

Slope

Good for showing changing data as long as the data can be simplified into 2 or 3 points without missing a key part of the story

Area chart

Use with care – these are good at showing changes to total, but seeing change in components can be very difficult

Fan chart (projection)

Use to show the uncertainty in future projections - usually this grows the further forward the projection

Connected scatterplot

A good way of showing changing data for two variables whenever there is a relatively clear pattern of progression.

Calendar heatmap

A great way of showing temporal patterns (daily, weekly, monthly) – at the expense of showing precision in quantity.

Priestley timeline

Great when date and duration are key elements of the story in the data.

  • Chart Doctor: Communicating with data: Timelines
  • Examples: FT

Circle timeline

Good for showing discrete values of varying size across multiple categories (eg earthquakes by continent).

Seismogram

Another alternative to the circle timeline for showing series where there are big variations in the data.

Part-to-whole

Show how a single entity can be broken down into its component elements. If the reader’s interest is solely in the size of the components, consider a magnitude-type chart instead. Example FT uses: Fiscal budgets, company structures, national election results

  • Flowing Data: 9 Ways to Visualize Proportions – A Guide

Stacked column

A simple way of showing part-to-whole relationships but can be difficult to read with more than a few components.

  • Robert Kosara: Stacked bars are the worst

Proportional stacked bar

A good way of showing the size and proportion of data at the same time – as long as the data are not too complicated.

  • Chart Doctor: How to apply Marimekko to data

Pie

A common way of showing part-to-whole data – but be aware that it’s difficult to accurately compare the size of the segments.

  • Robert Kosara: Ye olde pie chart debate
  • Robert Kosara: Pie Charts – Unloved, Unstudied, and Misunderstood
  • Robert Kosara: An Illustrated Tour of the Pie Chart Study Results
  • David Robinson: How to replace a pie chart
  • Office for National Statistics: The humble pie chart: part 1
  • Office for National Statistics: The humble pie chart: part 2
  • Ian Spence: No humble pie: The origins and usage of a statistical chart
  • Jeff Clark: In defense of pie charts
  • Stephen Few: Save the Pies for Dessert

Donut

Similar to a pie chart – but the centre can be a good way of making space to include more information about the data (eg total)

Treemap

Use for hierarchical part-to-whole relationships; can be difficult to read when there are many small segments.

Voronoi

A way of turning points into areas – any point within each area is closer to the central point than any other centroid.

Arc

A hemicycle, often used for visualising political results in parliaments.

Gridplot

Good for showing % information, they work best when used on whole numbers and work well in multiple layout form.

Venn

Generally only used for schematic representation

Waterfall

Can be useful for showing part-to-whole relationships where some of the components are negative.

Magnitude

Show size comparisons. These can be relative (just being able to see larger/bigger) or absolute (need to see fine differences). Usually these show a ‘counted’ number (for example, barrels, dollars or people) rather than a calculated rate or per cent. Example FT uses: Commodity production, market capitalisation

Column

The standard way to compare the size of things. Must always start at 0 on the axis

Bar

See above. Good when the data are not time series and labels have long category names.

Paired column

As per standard column but allows for multiple series. Can become tricky to read with more than 2 series.

Paired bar

See above.

Proportional stacked bar

A good way of showing the size and proportion of data at the same time – as long as the data are not too complicated.

  • Chart Doctor: How to apply Marimekko to data

Proportional symbol

Use when there are big variations between values and/or seeing fine differences between data is not so important.

Isotype (pictogram)

Excellent solution in some instances – use only with whole numbers (do not slice off an arm to represent a decimal).

Lollipop chart

Lollipop charts draw more attention to the data value than standard bar/column – does not HAVE to start at zero (but preferable).

Radar chart

A space-efficient way of showing the value of multiple variables – but make sure they are organised in a way that makes sense to the reader.

Parallel coordinates

An alternative to radar charts – again, the arrangement of the variables is important. Usually benefits from highlighting values.

Spatial

Used only when precise locations or geographical patterns in data are more important to the reader than anything else. Example FT uses: Locator maps, population density, natural resource locations, natural disaster risk/impact, catchment areas, variation in election results

  • Chart Doctor: Use fewer maps to illustrate data better
  • Matthew Ericson: When Maps Shouldn’t Be Maps
  • Mapbox: 7 data visualization techniques for location

Basic choropleth (rate/ratio)

The standard approach for putting data on a map – should always be rates rather than totals and use a sensible base geography

  • Vox: The bad map we see every presidential election
  • Vox: This “bad” election map? It’s not so bad.
  • UX•Blog: Telling the truth

Proportional symbol (count/magnitude)

Use for totals rather than rates – be wary that small differences in data will be hard to see.

  • Stephen Few: What Can’t Be Built with Bricks?

Flow map

For showing unambiguous movement across a map.

Contour map

For showing areas of equal value on a map. Can use deviation colour schemes for showing +/- values

Equalised cartogram

Converting each unit on a map to a regular and equally-sized shape – good for representing voting regions with equal value.

  • Chart Doctor: How the FT explained Brexit
  • 5W Blog: The power of cartograms and creating them easily

Scaled cartogram (value)

Stretching and shrinking a map so that each area is sized according to a particular value.

  • Chart Doctor: The search for a better US election map
  • 5W Blog: The power of cartograms and creating them easily
  • Vox: The bad map we see every presidential election

Dot density

Used to show the location of individual events/locations – make sure to annotate any patterns the reader should see.

  • Chart Doctor: The search for a better US election map

Heat map

Grid-based data values mapped with an intensity colour scale. As choropleth map – but not snapped to an admin/political unit.

  • 5W Blog: The power of cartograms and creating them easily

Flow

Show the reader volumes or intensity of movement between two or more states or conditions. These might be logical sequences or geographical locations. Example FT uses: Movement of funds, trade, migrants, lawsuits, information; relationship graphs.

  • RJ Andrews: Picturing the Great Migration

Sankey (aka river plot)

Shows changes in flows from one condition to at least one other; good for tracing the eventual outcome of a complex process.

  • Chart Doctor: Data visualisation: it is not all about technology

Waterfall

Designed to show the sequencing of data through a flow process, typically budgets. Can include +/- components.

Chord

A complex but powerful diagram which can illustrate 2-way flows (and net winner) in a matrix.

Network

Used for showing the strength and inter-connectedness of relationships of varying types.

Todo:

Uncertainty

  • Scientific American: Visualising uncertain weather
  • Oli Hawkins: Animating uncertainty

Animation

  • Chart Doctor: The storytelling genius of unveiling truths through charts
  • Evan Sinar: Use Animation to Supercharge Data Visualization

Interactivity

  • Chart Doctor: Why the FT creates so few clickable graphics
  • Gregor Aisch: In defense of interactive graphics
  • Zan Armstrong: Why choose? Scrollytelling and steppers

Map projections

Colour

UMBEL

From Wikipedia

"UMBEL (Upper Mapping and Binding Exchange Layer) is a logically organized knowledge graph of 34,000 concepts and entity types that can be used in information science for relating information from disparate sources to one another. It was retired at the end of 2019. UMBEL was first released in July 2008. Version 1.00 was released in February 2011. Its current release is version 1.50.

The grounding of this information occurs by common reference to the permanent URIs for the UMBEL concepts; the connections within the UMBEL upper ontology enable concepts from sources at different levels of abstraction or specificity to be logically related. Since UMBEL is an open-source extract of the OpenCyc knowledge base, it can also take advantage of the reasoning capabilities within Cyc.

UMBEL has two means to promote the semantic interoperability of information. It is:

  • An ontology of about 35,000 reference concepts, designed to provide common mapping points for relating different ontologies or schema to one another, and
  • A vocabulary for aiding that ontology mapping, including expressions of likelihood relationships distinct from exact identity or equivalence. This vocabulary is also designed for interoperable domain ontologies.

UMBEL is written in the Semantic Web languages of SKOS and OWL 2. It is a class structure used in Linked Data, along with OpenCyc, YAGO, and the DBpedia ontology. Besides data integration, UMBEL has been used to aid concept search, concept definitions, query ranking, ontology integration, and ontology consistency checking. It has also been used to build large ontologies and for online question answering systems.

Including OpenCyc, UMBEL has about 65,000 formal mappings to DBpedia, PROTON, GeoNames, and schema.org, and provides linkages to more than 2 million Wikipedia pages (English version). All of its reference concepts and mappings are organized under a hierarchy of 31 different "super types", which are mostly disjoint from one another. Each of these "super types" has its own typology of entity classes to provide flexible tie-ins for external content. 90% of UMBEL is contained in these entity classes." - Wikipedia

Linked data

Wikipedia Definition

"In computing, linked data is structured data which is interlinked with other data so it becomes more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages only for human readers, it extends them to share information in a way that can be read automatically by computers. Part of the vision of linked data is for the Internet to become a global database.

Tim Berners-Lee, director of the World Wide Web Consortium (W3C), coined the term in a 2006 design note about the Semantic Web project.

Linked data may also be open data, in which case it is usually described as Linked Open Data.

Principles

In his 2006 "Linked Data" note, Tim Berners-Lee outlined four principles of linked data, paraphrased along the following lines:

  • Uniform Resource Identifiers (URIs) should be used to name and identify individual things.
  • HTTP URIs should be used to allow these things to be looked up, interpreted, and subsequently "dereferenced".
  • Useful information about what a name identifies should be provided through open standards such as RDF, SPARQL, etc.
  • When publishing data on the Web, other things should be referred to using their HTTP URI-based names.

Tim Berners-Lee later restated these principles at a 2009 TED conference, again paraphrased along the following lines:

  • All conceptual things should have a name starting with HTTP.
  • Looking up an HTTP name should return useful data about the thing in question in a standard format.
  • Anything else that that same thing has a relationship with through its data should also be given a name beginning with HTTP. ..." - Wikipedia
Resources
  • https://en.wikipedia.org/wiki/Linked_data
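
To tie the four principles back to a data model explorer, the sketch below names model objects with HTTP URIs and describes them as RDF-style triples written out in N-Triples syntax. The base URI `http://example.org/model/`, the `table/…/column/…` path scheme, and the `hasColumn` predicate are all hypothetical; only the `rdfs:label` URI is a real, published term.

```python
BASE = "http://example.org/model/"  # hypothetical namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
HAS_COLUMN = BASE + "vocab/hasColumn"  # hypothetical predicate


def table_uri(table):
    """HTTP URI naming a table (principles 1 and 2)."""
    return f"{BASE}table/{table}"


def column_uri(table, column):
    """HTTP URI naming a column within a table."""
    return f"{BASE}table/{table}/column/{column}"


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines.

    Objects starting with "http" are written as URIs, others as plain
    literals. Literal escaping is not handled; this is a sketch only.
    """
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = [
    # Useful information about the named thing (principle 3).
    (table_uri("invoice"), RDFS_LABEL, "Invoice"),
    # A link to another thing by its HTTP URI (principle 4).
    (table_uri("invoice"), HAS_COLUMN, column_uri("invoice", "total")),
]
print(to_ntriples(triples))
```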

CKAN

Wikipedia Description

"The Comprehensive Knowledge Archive Network (CKAN) is an open-source open data portal for the storage and distribution of open data. Initially inspired by the package management capabilities of Debian Linux, CKAN has developed into a powerful data catalogue system that is mainly used by public institutions seeking to share their data with the general public.

Rufus Pollock developed its first version in 2005-2006. Since its inception, CKAN has evolved and is the leading open data platform software in the world, used by governments including the US and UK, to publish millions of public datasets.

CKAN's codebase is maintained by the Open Knowledge Foundation. The system is used both as a public platform on Datahub and in various government data catalogues, such as the UK's data.gov.uk, the Dutch National Data Register, the United States government's Data.gov and the Australian government's "Gov 2.0". The state government of South Australia also makes government data freely available to the public on the CKAN platform. The Italian government makes available the open data of the Data & Analytics Framework on the CKAN platform.

Internal technology

CKAN's back end, the part running on the Web server, is written mainly in Python. The web pages it offers to users' browsers include JavaScript. CKAN maintains information about the data sets to be offered to users in PostgreSQL databases. Searches are implemented by Solr. CKAN installations can be queried through Web APIs.

Future of the project

The CKAN Stewardship proposal jointly put forward by Link Digital and Datopian received support from the Open Knowledge Foundation Board. In appointing joint stewardship put up jointly by Link Digital and Datopian, the Board felt there was a clear practical path with strong leadership and committed funding to see CKAN grow and prosper in the years to come. The Open Knowledge Foundation will remain the ‘purpose trustee’ to ensure the Stewards remain true to the purpose and ethos of the CKAN project.

Similar projects and alternatives

  • Dataverse provides similar functions and is widely used for open data.
  • DKAN is a Drupal-based open data portal based on CKAN." - Wikipedia
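
The Web API access mentioned under "Internal technology" can be sketched as follows. CKAN's Action API exposes a `package_search` action under `/api/3/action/`. The portal address `https://demo.example.org`, the helper names, and the sample response body are placeholders made up for this example, and no network call is made.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal, query, rows=5):
    """Build a CKAN Action API URL for a dataset search."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [d["title"] for d in payload["result"]["results"]]


# Hypothetical response, trimmed to the fields used above.
sample = json.dumps({
    "success": True,
    "result": {"count": 1, "results": [{"title": "Road traffic counts"}]},
})

print(package_search_url("https://demo.example.org/", "traffic"))
print(dataset_titles(sample))
```

Keeping URL construction separate from response parsing means the parsing step can be checked against a saved response without contacting a live portal.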

Examples of Use

Open Knowledge Foundation

"Many of Open Knowledge Foundation's projects are technical in nature. Its most prominent project, CKAN, is used by many of the world's governments to host open catalogues of data that their countries possess.

The organisation tends to support its aims by hosting infrastructure for semi-independent projects to develop. This approach to organising was hinted as one of its earliest projects was a project management service called KnowledgeForge, which runs on the KForge platform. KnowledgeForge allows sectoral working groups to have space to manage projects related to open knowledge. More widely, the project infrastructure includes both technical and face-to-face aspects. The organisation hosts several dozen mailing lists for virtual discussion, utilises IRC for real-time communications and also hosts events." - Wikipedia

Aims

"The aims of Open Knowledge Foundation are:
  • Promoting the idea of open knowledge, both what it is, and why it is a good idea.
  • Running open knowledge events, such as OKCon.
  • Working on open knowledge projects, such as Open Economics or Open Shakespeare.
  • Providing infrastructure, and potentially a home, for open knowledge projects, communities and resources. For example, the KnowledgeForge service and CKAN.
  • Acting at UK, European and international levels on open knowledge issues." - Wikipedia

Open Knowledge Foundation

Vision

"Our vision is that openness and open knowledge are adopted by every government, institution and movement to ensure access to critical information that will empower humans to solve the most pressing problems of our times, leading to a sustainable, fair and open future for all." - OKFN

"The world’s institutions are decaying rapidly by embracing a culture where knowledge is privatised, artificially restricted behind paywalls or secrecy laws, violently extracted from groups or even extinct due to neglect and austerity. The future world is being built by corporations in closed virtual reality spaces, blocking all possibilities of generativity and democracy from flourishing, or in closed rooms and opaque systems. 

The root of the problem is an institutional architecture designed for that to happen, limited literacies and skills of people to act on the problem and no effective models and tools to replace the current systems. Our hope and focus will be to reverse it and push for openness as a design principle to build future institutions. 

We believe it is time for new rules, models and tools that generate improved conditions for knowledge to be shared by all in a fair, free and open future. In the face of rising inequality, global threats to our shared environment, and fading social consensus, open knowledge is not simply the opposite of closed societies: it opposes misleading facts (about societal issues), illegible data (from scientific research), privatised information (held by tech platforms) and of course withheld documents (about government or corporate acts). Those elements can exist even in a technically 'open society'. Our mission is to enable a future where communities, tools and best practices exist to keep those threats to the health of our societies at bay. And do it as agile, scalable, and adaptable to local circumstances, as possible." - OKFN

Digital Public Goods Alliance

Mikes Notes

Here are some notes copied from the Digital Public Goods Alliance. They would be good to incorporate into the Ajabbi Handbook. Except for the core, most of Ajabbi will be open-source.

Objectives

The five-year objectives of the Digital Public Goods Alliance are:

  • Digital public goods with high-potential for addressing critical development needs and urgent global challenges are discoverable, sustainably managed, and accessible for government institutions and other relevant implementing organisations.
  • UN-institutions, multilateral development banks and other public and private institutions that are of high relevance for supporting implementation of digital technologies have the knowledge, capacity, and incentives to effectively promote and support adoption of DPGs.
  • Government institutions have the information, motivation, and capacity to effectively implement DPGs that address country needs, including to plan, deploy, maintain, and evolve their digital public infrastructure.
  • Countries have public sector capacity and vibrant commercial ecosystems in place to create, maintain, implement, and incubate DPGs locally.

Tips For Open Standards

Open standards establish protocols and building blocks that can help make digital public goods more functional and interoperable. This not only streamlines product development, it removes vendor-imposed boundaries to read or write data files by improving data exchange. Below are some of the common open standards by category:

Accessibility

  • WCAG 2.0/2.1 (Web Content Accessibility Guidelines)

Security

  • ISO/IEC 27001 (Information Security Management)
  • ISO/IEC 27018:2019 (Information technology — Security techniques — Code of practice for protection of personally identifiable information (PII) in public clouds acting as PII processors)
  • PKI
  • HTTPS
  • SSL
  • SSH
  • GPG
  • RS256
  • HS256
  • AES
  • ES256

Authentication & Authorization

  • OAuth 2
  • OIDC (OpenID Connect)
  • JWT (JSON Web Tokens)
  • SAML (Security Assertion Markup Language)
  • XACML 3.0 (eXtensible Access Control Markup Language)

Internationalization (i18n)

  • UTF-8
  • ISO-8859-1
  • ASCII

Web standards

  • HTML
  • CSS
  • ECMAScript (ES 5/6/7)
  • LaTeX

Application Programming Interfaces (APIs)

  • OpenAPI
  • GraphQL

Data Exchange/ Configuration formats

  • JSON
  • YAML
  • XML
  • TOML
  • CSV
  • TIFF
  • HDF5
  • RDF

Geographic Information System (GIS)

  • GeoPackage
  • GeoTIFF

Software Testing

  • IEEE829
  • ISO/IEC/IEEE29119

Business Process Modelling

  • BPMN 2.0

Credentialing

  • W3C VC

Standard Content formats

  • PDF
  • H5P
  • ePub
  • WebM

Multimedia

  • SVG (Scalable Vector Graphics)
  • PNG (Portable Network Graphics)
  • JPEG (Joint Photographic Experts Group)
  • Ogg
  • MP3 (Moving Picture Experts Group: Audio Layer III)
  • FLAC (Free Lossless Audio Codec)
  • H.264 (H.264/MPEG-4 AVC)
  • AAC (Advanced Audio Coding)
  • MP4 (MPEG-4 Part 14)

Virtual Reality/ Augmented Reality (VR /AR)

  • WebXR
  • IEEE Digital Reality standards

Computer Communications Protocols

  • WebSocket

Whistleblowing management systems

  • ISO 37002:2021 (Whistleblowing management systems — Guidelines)

Sector-specific standards

  • FHIR (Fast Healthcare Interoperability Resources) - Healthcare
  • openEHR - Healthcare
  • OCDS (Open Contracting Data Standard) - Open government
  • Open Fiscal Data Package - Open government
  • International Aid Transparency Initiative (IATI) Standard - Aid
  • GTFS (General Transit Feed Specification) - Mobility