
API Aggregation News

These are the news items I've curated in my monitoring of the API space that have some relevance to the API aggregation conversation and that I wanted to include in my research. I'm using all of these links to better understand how the space is testing their APIs, going beyond just monitoring and understanding the details of each request and response.

Challenges When Aggregating Data Published Across Many Years

My partner in crime is working on a large data aggregation project regarding ed-tech funding. She is publishing data to Google Sheets, and I’m helping her develop Jekyll templates she can fork and expand using Github when it comes to publishing and telling stories around this data across her network of sites. Like API Evangelist, Hack Education runs as a network of Github repositories, with a common template across them–we call the overlap between API Evangelist and Hack Education Contrafabulists.

One of the smaller projects she is working on as part of her ed-tech funding research involves pulling the grants made by the Gates Foundation since the 1990s. It is similar to my story a couple of weeks ago about my friend David Kernohan, who wanted to pull data from multiple sources and aggregate it into a single, workable project. Audrey is looking to pull data from a single source, but because the data spans almost 20 years–it ends up being a lot like aggregating data from across multiple sources.

A couple of the challenges she is facing trying to gather the data, and aggregate as a common dataset are:

  • PDF - The enemy of any open data advocate is the PDF, and a portion of her research data is only available in PDF format, which translates into a good deal of manual work.
  • Search - Other portions of the data are available via the web, but obfuscated behind search forms requiring many different searches to occur, with paginated results to navigate.
  • Scraping - The lack of APIs, CSV, XML, and other machine readable results raises the bar when it comes to aggregating and normalizing data across many years, making scraping a consideration, but because of PDFs, and obfuscated HTML pages behind a search, even scraping will have significant costs.
  • Format - Even once you’ve aggregated data from across the many sources, there is a challenge with it being in different formats. Some years are broken down by topic, while others are geographically based. All of this requires a significant amount of overhead to normalize and bring into focus.
  • Manual - Ultimately Audrey has a lot of work ahead of her, manually pulling PDFs and performing searches, then copying and pasting data locally. Then she’ll have to roll up her sleeves to normalize all the data she has aggregated into a single, coherent vision of where the foundation has put its money.
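
The normalization step described in the Format and Manual items can be sketched roughly as follows. This is a minimal illustration, not the actual workflow: the column names, the alias table, and the sample rows are all invented, and the real data would need far more cleanup.

```python
# Hypothetical sketch: the field names and sample rows below are invented for
# illustration and are not taken from the actual grant data.

# Map each year-specific column name onto one shared schema.
FIELD_ALIASES = {
    "grantee": "recipient",
    "organization": "recipient",
    "amount": "amount_usd",
    "total_awarded": "amount_usd",
    "topic": "grouping",
    "region": "grouping",
}


def parse_amount(value):
    """Turn strings such as '$1,250,000' into an integer number of dollars."""
    return int(str(value).replace("$", "").replace(",", "").strip())


def normalize_row(row, year):
    """Rename known columns, clean the amount, and tag the row with its year."""
    normalized = {"year": year}
    for key, value in row.items():
        target = FIELD_ALIASES.get(key.strip().lower())
        if target is None:
            continue  # ignore columns we have no mapping for
        if target == "amount_usd":
            normalized[target] = parse_amount(value)
        else:
            normalized[target] = value.strip()
    return normalized


# Two invented rows, one organized by topic and one by geography.
rows_1999 = [{"Grantee": "Example School District", "Amount": "$1,250,000", "Topic": "Libraries"}]
rows_2015 = [{"Organization": "Example University ", "Total_Awarded": "300000", "Region": "Pacific Northwest"}]

dataset = [normalize_row(r, 1999) for r in rows_1999]
dataset += [normalize_row(r, 2015) for r in rows_2015]
for record in dataset:
    print(record)
```

The alias table is the part that takes human judgment: each new year of data tends to add entries to it, which is where much of the manual overhead comes from.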

Data research takes time, and is tedious, mind numbing work. I encounter many projects like hers where I have to make a decision between scraping or manually aggregating and normalizing data–each project will have its own pros and cons. I wish I could help, but it sounds like it will end up being a significant amount of manual labor to establish a coherent set of data in Google Sheets. Once she is done, though, she has all the tools in place to publish as YAML to Github, and get to work telling stories around the data across her work using Jekyll and Liquid. I’m also helping her make sure she has a JSON representation of each of her data projects, allowing others to build on top of her hard work.

I wish all companies, organizations, institutions, and agencies would think about how they publish their data publicly. It’s easy to think that data stewards will have ill intentions when it comes to publishing data in a variety of formats like they do, but more likely it is just a change of stewardship when it comes to managing and publishing the data. Different folks will have different visions of what sharing data on the web needs to look like, and have different tools available to them, and without a clear strategy you’ll end up with a mosaic of published data over the years. Which is why I’m telling her story. I am hoping to possibly influence one or two data stewards, or would-be data stewards when it comes to the importance of pausing for a moment and thinking through your strategy for standardizing how you store and publish your data online.


Simple APIs With Jekyll And Github With Data Managed Via Google Spreadsheets

I'm always looking for simpler, and cheaper ways of doing APIs that can help anyone easily manage data while making it available in both a human and machine readable way--preferably something developers and non-developers both will find useful. I've pushed forward my use of Github when it comes to managing simple datasets, and have a new approach I want to share, and potentially use across other projects.

You can find a working example of this in action with my OpenAPI Toolbox, where I'm looking to manage and share a listing of tooling that is built on top of the OpenAPI specification. Like the rest of my API research, I am looking to manage the data in a simple and cheap way, so that I can offload the storage, compute, and bandwidth to other providers, preferably ones that don't cost me a dime. While not a solution that would work in every API scenario, I am pretty happy with the formula I've come up with for my OpenAPI Toolbox.

Data Storage and Management In Google Sheets
The data used in the OpenAPI Toolbox comes from a public Google Sheet. I manage all the tools in the toolbox via this spreadsheet, tracking title, description, URL, image, organization, types, tags, and license using the spreadsheet. I have two separate worksheets, one of which tracks details on the organizations, and the other keeping track of each individual tool in the toolbox. This allows for the data to be managed by anyone I give access to the sheet using Google Docs, offloading storage and data management to Google. Sure, it has its limitations, but for simple datasets, it works better than a database in my opinion.

Website and Basic API Hosting Using Github
First, and foremost the data in the OpenAPI Toolbox is meant to be accessible by any human on the web. Github, using their Github Pages solution, combined with the static website tool Jekyll, provides a rich environment for managing this data-driven toolbox. Jekyll provides the ability to store YAML data in its _data folder, which I can then use across static HTML pages which display the data using Liquid syntax. This approach to managing data allows for easy publishing of static representations in HTML, JSON, and YAML, making the data easily consumable by humans and machines, in an environment that is version controlled, forkable, and free for publicly available projects.

JavaScript Spreadsheet Sync To YAML Data Store
To keep the data in the Google Spreadsheet in sync with the YAML data store in the Github hosted repository I use a simple JavaScript driven page on the website. To make it work you have to provide a valid Github OAuth token to be passed along as query string like this http://openapi.toolbox.apievangelist.com/pull-spreadsheet/?token=[github token]. The token can be acquired by doing the usual OAuth dance with Github or using the Github account of any user where you can issue personal tokens. If the user is a valid contributor on the repository, the JavaScript will pull a recent copy of the data in the Google Spreadsheet, and publish as YAML in the _data folder for the toolbox repository successfully--otherwise, it just throws an error.
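
The conversion at the heart of this sync can be sketched in a few lines. The toolbox itself does this with JavaScript in the browser; the version below swaps in Python, skips the Google Sheets and Github calls entirely, and uses invented sample rows. The function names rows_to_records and to_yaml are hypothetical.

```python
import json


def rows_to_records(rows):
    """Treat the first row as the header and turn each later row into a dict."""
    header, *body = rows
    keys = [h.strip().lower().replace(" ", "_") for h in header]
    return [dict(zip(keys, row)) for row in body]


def to_yaml(records):
    """Emit a flat YAML list of mappings.

    Values are written with json.dumps, relying on JSON scalars being valid
    YAML, so no third-party YAML library is needed for this simple shape.
    """
    lines = []
    for record in records:
        for i, (key, value) in enumerate(record.items()):
            prefix = "- " if i == 0 else "  "
            lines.append(f"{prefix}{key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


# Invented rows standing in for what a spreadsheet export might return.
rows = [
    ["Title", "URL", "Types"],
    ["Example Tool", "https://example.com/tool", "parsers"],
]
yaml_text = to_yaml(rows_to_records(rows))
print(yaml_text)
```

The resulting text is what would be committed to a file in the _data folder, at which point Jekyll can read it like any other YAML data file.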

HTML Toolbox For Humans To Browse The Toolbox
Jekyll provides a static website that acts as the public face for the OpenAPI Toolbox. The home page provides icon links to each of the types of tools I have indexed, as well as to specific tags I've selected, such as the programming language of each tool. Each page of the website is an HTML page that uses Liquid to display data stored in the central YAML store. Liquid handles the filtering of data by type, tags, or any other criteria I choose. Soon I will be adding a search, and other ways to browse the data in the toolbox as the data store grows, and I obtain more data points to slice and dice things on. 

JSON API To Put To Use Across Other Applications
To provide the API-esque functionality I also use Liquid to display data from the YAML data store, but instead of publishing as HTML, I publish as JSON, essentially providing a static API facade. The primary objective of this type of implementation is to allow for GET requests on a variety of machine-readable paths for the toolbox. I published a full JSON listing of the entire toolbox, as well as more precise paths for getting at types of tools, and specific programming language tools. Similar to the human side of the toolbox, I will be adding more paths as more data points become available in the OpenAPI toolbox data store.
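
To make the idea of a static API facade concrete, here is a rough sketch that pre-computes one JSON document per GET path. The toolbox does this with Liquid templates at build time; Python is swapped in here for illustration, and the records, path layout, and the build_static_api name are all assumptions.

```python
import json

# Invented records standing in for the YAML data store.
tools = [
    {"title": "Example Parser", "type": "parsers", "language": "python"},
    {"title": "Example Validator", "type": "validators", "language": "javascript"},
    {"title": "Example Linter", "type": "validators", "language": "python"},
]


def build_static_api(records):
    """Return a mapping of relative file path -> JSON text, one per GET path."""
    files = {"index.json": list(records)}
    for record in records:
        files.setdefault(f"types/{record['type']}.json", []).append(record)
        files.setdefault(f"languages/{record['language']}.json", []).append(record)
    return {path: json.dumps(body, indent=2) for path, body in files.items()}


api = build_static_api(tools)
for path in sorted(api):
    print(path)
```

Because every response is generated ahead of time, each new way of slicing the data means one more set of files, which is why more paths only appear as more data points become available.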

Documentation Using Liquid and OpenAPI Definition
Rather than just making the data available via JSON files, I wanted to also provide simple API documentation demonstrating what was possible with the data stored in the toolbox. I crafted an OpenAPI for the OpenAPI Toolbox API, providing a machine readable definition of all the paths available. Again, using Liquid I generate simple API documentation and schema, that actually allows you to make calls against the API, using a simple interactive JavaScript interface. While the OpenAPI Toolbox is technically static, using Liquid and OpenAPI I was able to mimic much of the functionality developers are used to when it comes to API integration.
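
A machine readable definition for a set of static, read-only paths can be quite small. The sketch below builds a minimal OpenAPI 2.0 document with one GET operation per path; the host, path names, and summaries are placeholders rather than the toolbox's actual definition, and describe_static_api is a hypothetical helper.

```python
import json


def describe_static_api(title, host, paths):
    """Build a minimal OpenAPI 2.0 document with one GET operation per path."""
    definition = {
        "swagger": "2.0",
        "info": {"title": title, "version": "1.0.0"},
        "host": host,
        "schemes": ["http"],
        "produces": ["application/json"],
        "paths": {},
    }
    for path, summary in paths.items():
        definition["paths"][path] = {
            "get": {
                "summary": summary,
                "responses": {"200": {"description": "A JSON array of tools."}},
            }
        }
    return definition


# Placeholder host and paths, chosen only for illustration.
spec = describe_static_api(
    "Example Toolbox API",
    "example.com",
    {
        "/index.json": "List every tool",
        "/types/validators.json": "List validator tools",
    },
)
print(json.dumps(spec, indent=2))
```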

Project Support and Road Map Using Github Issues
As with all of my projects I am using the underlying issue management system to help me manage support and the roadmap for the project. Anyone can submit an issue regarding a tool they'd like to see in the toolbox, regarding API integration, or possibly new APIs they would like to see published. I can use the Github issue management to handle general support requests, and communication around the project, as well as incrementally manage the data, schema, website, and API for the toolbox. 

Indexed In Machine Readable Way With APIs.json
The entire project is indexed using APIs.json, providing metadata for the project as well as other indexes for the API, support, and other aspects of operating the project. APIs.json is meant to provide a machine readable index for not just the API, which is already defined using OpenAPI, but for the rest of the project, including documentation and support, and eventually a road map, blog, and other supporting elements. Using the APIs.json index, other systems can easily discover the API, and programmatically access the data via the APIs, or even access the repository for the spreadsheet via the Github API, or the Google Sheet via its API--all the information is available in the APIs.json for use.
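
As a rough illustration of how another system might use such an index, the sketch below looks up a property URL by type. The field names (apis, humanURL, baseURL, properties, type, url) follow my understanding of the APIs.json format and should be checked against the specification; every name and URL in the sample is a placeholder, and find_property is a hypothetical helper.

```python
# Field names follow my understanding of the APIs.json format; every URL and
# name below is a placeholder, not the toolbox's real index.
apis_json = {
    "name": "Example Toolbox",
    "description": "A hypothetical data-driven toolbox project.",
    "url": "https://example.com/apis.json",
    "apis": [
        {
            "name": "Example Toolbox API",
            "humanURL": "https://example.com/",
            "baseURL": "https://example.com/api/",
            "properties": [
                {"type": "x-openapi-spec", "url": "https://example.com/openapi.json"},
                {"type": "x-documentation", "url": "https://example.com/docs/"},
                {"type": "x-issues", "url": "https://example.com/issues/"},
            ],
        }
    ],
}


def find_property(index, property_type):
    """Return the URL of the first property of the given type, or None."""
    for api in index.get("apis", []):
        for prop in api.get("properties", []):
            if prop.get("type") == property_type:
                return prop.get("url")
    return None


print(find_property(apis_json, "x-openapi-spec"))
```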

A Free Forkable Way To Manage Simple Data And APIs
This approach to doing APIs won't be something you will want to do for every API implementation, but for simple data-driven projects like the OpenAPI Toolbox, it works great. Google Sheets and Github are both free, so I am offloading the storage, hosting, and bandwidth to another provider, while I am managing things in a way that any tech-savvy user could engage with. Anyone could manage entries in the toolbox using the Google Sheet and even engage with humans, and other applications via the published project toolbox.

I am going to continue evolving this approach to fit some of my other data-driven projects. I really like having all my projects self-contained as individual repositories, and the public side of things running using Jekyll--the entire API Evangelist network runs this way. I also like having the data managed in individual Google Sheets like this. It gives me simple data stores that I can easily manage with the help of other data stewards. Each of the resulting projects exists as a static representation of each data set--in this case an OpenAPI toolbox. I have many other toolboxes, toolkits, curriculum, and API research areas that are data driven, and will benefit from this approach.

What really makes me smile about this is that each project has an API representation of its core. Sure, I don't have POST, PUT, and DELETE capabilities for these APIs, or even advanced search capabilities, but for projects that are heavily read only--this works just fine. If you think about it though, I can easily access the data for reading and writing using the Google Sheets or Github APIs, depending on which layer I want to get access at. Really I am just looking to allow for easy management of data using Google Sheets, and simple publishing as YAML, JSON, and HTML, so that humans can browse, as well as put to use in other applications.


Adding An Extensions Category To The OpenAPI Toolbox

I added another type of tool to my OpenAPI Toolbox, this time it is extensions. They used to be called Swagger vendor extensions, and now they are simply called OpenAPI extensions, which allow any implementor to extend the schema outside the current version of the API specification. All you do to add an OpenAPI extension is prepend x- to the name of any property that you wish to include in your OpenAPI, and the validator will overlook it as part of the specification.
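
Because extensions are just properties whose names start with x-, spotting them in a definition is a simple recursive walk. The sketch below is one way to do that; the extension names in the sample (x-example-provenance, x-example-visibility) are made up and do not come from any vendor, and find_extensions is a hypothetical helper.

```python
def find_extensions(node, path="#"):
    """Yield (location, name) for every x- prefixed key in a nested definition."""
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(key, str) and key.startswith("x-"):
                yield path, key
            # Keep walking, since extensions can appear at any depth.
            yield from find_extensions(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_extensions(item, f"{path}/{index}")


# A tiny invented definition with two made-up extensions.
definition = {
    "swagger": "2.0",
    "info": {"title": "Example", "version": "1.0.0", "x-example-provenance": "hand-written"},
    "paths": {
        "/tools": {
            "get": {
                "x-example-visibility": "internal",
                "responses": {"200": {"description": "OK"}},
            }
        }
    },
}

found = list(find_extensions(definition))
for location, name in found:
    print(location, name)
```

The location string is only a readable breadcrumb for a human; it is not escaped the way a proper JSON Pointer would be.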

I have a whole list of vendor extensions I'd like to add, but I've started with a handful from Microsoft Flow, and my friends over at APIMATIC. Two great examples of how OpenAPI extensions can be used in the API lifecycle. In this case, one is for integration platform as a service (iPaaS), and the other is SDK generation and continuous integration. Both vendors needed to extend the specification to meet a specific need, so they just extended it as required--you can find the extensions in the new section of the toolbox.

My goal in adding the section to the OpenAPI toolbox is to highlight how people are evolving the specification outside the core working group. While some of the extensions are pretty unique, some of them have a potential common purpose. I will be adding some discovery focused extensions next week from the OpenAPI directory APIs.guru, which I will be adopting and using in my own definitions to help me articulate the provenance of any OpenAPI definition in my catalog(s). Plus, I find it to be a learning experience to see how different vendors are putting them to work. 

If you know of any OpenAPI extensions that are not in the toolbox currently feel free to submit an issue on the Github repository for the project. I'd like to evolve the collection to be a comprehensive look at how OpenAPI extensions are being used across the sector, from a diverse number of providers. I'm going to be teaching my own OpenAPI crawler to identify extensions within any OpenAPI it comes across, and automatically submit an issue on the toolbox, rapidly expanding the discovery of how they are used across a variety of implementations in coming months.


The List Of API Signals I Track On In My API Stack Research

I keep an eye on several thousand companies as part of my research into the API space and publish over a thousand of these profiles in my API Stack project. Across the over 1,100 companies, organizations, institutions, and government agencies I'm regularly running into a growing number of signals that tune me into what is going on with each API provider, or service provider. 

Here are the almost 100 types of signals I am tuning into as I keep an eye on the world of APIs, each contributing to my unique awareness of what is going on with everything API.

  • Account Settings (x-account-settings) - Does an API provider allow me to manage the settings for my account?
  • Android SDK (x-android-sdk) - Is there an Android SDK present?
  • Angular (x-angularjs) - Is there an Angular SDK present?
  • API Explorer (x-api-explorer) - Does a provider have an interactive API explorer?
  • Application Gallery (x-application-gallery) - Is there a gallery of applications built on an API available?
  • Application Manager (x-application-manager) - Does the platform allow me to manage my APIs?
  • Authentication Overview (x-authentication-overview) - Is there a page dedicated to educating users about authentication?
  • Base URL for API (x-base-url-for-api) - What is the base URL(s) for the API?
  • Base URL for Portal (x-base-url-for-portal) - What is the base URL for the developer portal?
  • Best Practices (x-best-practices) - Is there a page outlining best practices for integrating with an API?
  • Billing history (x-billing-history) - As a developer, can I get at the billing history for my API consumption?
  • Blog (x-blog) - Does the API have a blog, either at the company level, but preferably at the API and developer level as well?
  • Blog RSS Feed (x-blog-rss-feed) - Is there an RSS feed for the blog?
  • Branding page (x-branding-page) - Is there a dedicated branding page as part of API operations?
  • Buttons (x-buttons) - Are there any embeddable buttons available as part of API operations?
  • C# SDK (x-c-sharp) - Is there a C# SDK present?
  • Case Studies (x-case-studies) - Are there case studies available, showcasing implementations on top of an API?
  • Change Log (x-change-log) - Does a platform provide a change log?
  • Chrome Extension (x-chrome-extension) - Does a platform offer up open-source or white label chrome extensions?
  • Code builder (x-code-builder) - Is there some sort of code generator or builder as part of platform operations?
  • Code page (x-code-page) - Is there a dedicated code page for all the samples, libraries, and SDKs?
  • Command Line Interface (x-command-line-interface) - Is there a command line interface (CLI) alongside the API?
  • Community Supported Libraries (x-community-supported-libraries) - Is there a page or section dedicated to code that is developed by the API and developer community?
  • Compliance (x-compliance) - Is there a section dedicated to industry compliance?
  • Contact form (x-contact-form) - Is there a contact form for getting in touch?
  • Crunchbase (x-crunchbase) - Is there a Crunchbase profile for an API or its company?
  • Dedicated plans pricing page (x-dedicated-plans--pricing-page)
  • Deprecation policy (x-deprecation-policy) - Is there a page dedicated to deprecation of APIs?
  • Developer Showcase (x--developer-showcase) - Is there a page that showcases API developers?
  • Documentation (x-documentation) - Where is the documentation for an API?
  • Drupal (x-drupal) - Is there Drupal code, SDK, or modules available for an API?
  • Email (x-email) - Is an email address available for a platform?
  • Embeddable page (x-embeddable-page) - Is there a page of embeddable tools available for a platform?
  • Error response codes (x-error-response-codes) - Is there a listing or page dedicated to API error responses?
  • Events (x-events) - Is there a calendar of events related to platform operations?
  • Facebook (x-facebook) - Is there a Facebook page available for an API?
  • FAQ (x-faq) - Is there an FAQ section available for the platform?
  • Forum (x-forum) - Does a provider have a forum for support and asynchronous conversations?
  • Forum rss (x-forum-rss) - If there is a forum, does it have an RSS feed?
  • Getting started (x-getting-started) - Is there a getting started page for an API?
  • Github (x-github) - Does a provider have a Github account for the API or company?
  • Glossary (x-glossary) - Is there a glossary of terms available for a platform?
  • Heroku (x-heroku) - Are there Heroku SDKs, or deployment solutions?
  • How-To Guides (x-howto-guides) - Does a provider offer how-to guides as part of operations?
  • Interactive documentation (x-interactive-documentation) - Is there interactive documentation available as part of operations?
  • iOS SDK (x-ios-sdk) - Is there an iOS SDK for Objective-C or Swift?
  • Issues (x-issues) - Is there an issue management page or repo for the platform?
  • Java SDK (x-java) - Is there a Java SDK for the platform?
  • JavaScript API (x-javascript-api) - Is there a JavaScript SDK available for a platform?
  • Joomla (x-joomla) - Is there a Joomla plugin for the platform?
  • Knowledgebase (x-knowledgebase) - Is there a knowledgebase for the platform?
  • Labs (x-labs) - Is there a labs environment for the API platform?
  • Licensing (x-licensing) - Is there licensing for the API, schema, and code involved?
  • Message Center (x-message-center) - Is there a messaging center available for developers?
  • Mobile Overview (x-mobile-overview) - Is there a section or page dedicated to mobile applications?
  • Node.js (x-nodejs) - Is there a Node.js SDK available for the API?
  • OAuth Scopes (x-oauth-scopes) - Does a provider offer details on the available OAuth scopes?
  • OpenAPI spec (x-openapi-spec) - Is there an OpenAPI available for the API?
  • Overview (x-overview) - Does a platform have a simple, concise description of what they do?
  • Paid support plans (x-paid-support-plans) - Are there paid support plans available for a platform?
  • Postman Collections (x-postman) - Are there any Postman Collections available?
  • Partner (x-partner) - Is there a partner program available as part of API operations?
  • Phone (x-phone) - Does a provider publish a phone number?
  • PHP SDK (x-php) - Is there a PHP SDK available for an API?
  • Privacy Policy (x-privacy-policy-page) - Does a platform have a privacy policy?
  • PubSub (x-pubsubhubbub) - Does a platform provide a PubSub feed?
  • Python SDK (x-python) - Is there a Python SDK for an API?
  • Rate Limiting (x-rate-limiting) - Does a platform provide information on API rate limiting?
  • Real Time Solutions (x-real-time-page) - Are there real-time solutions available as part of the platform?
  • Road Map (x-road-map) - Does a provider share their roadmap publicly?
  • Ruby SDK (x-ruby) - Is there a Ruby SDK available for the API?
  • Sandbox (x-sandbox) - Is there a sandbox for the platform?
  • Security (x-security) - Does a platform provide an overview of security practices?
  • Self-Service registration (x-self-service-registration) - Does a platform allow for self-service registration?
  • Service Level Agreement (x-service-level-agreement) - Is an SLA available as part of platform integration?
  • Slideshare (x-slideshare) - Does a provider publish talks on Slideshare?
  • Stack Overflow (x-stack-overflow) - Does a provider actively use Stack Overflow as part of platform operations?
  • Starter Projects (x-starter-projects) - Are there starter projects available as part of platform operations?
  • Status Dashboard (x-status-dashboard) - Is there a status dashboard available as part of API operations?
  • Status History (x-status-history) - Can you get at the history involved with API operations?
  • Status RSS (x-status-rss) - Is there an RSS feed available as part of the platform status dashboard?
  • Support Page (x-support-overview-page) - Is there a page or section dedicated to support?
  • Terms of Service (x-terms-of-service-page) - Is there a terms of service page?
  • Ticket System (x-ticket-system) - Does a platform offer a ticketing system for support?
  • Tour (x-tour) - Is a tour available to walk a developer through platform operations?
  • Trademarks (x-trademarks) - Are there details about trademarks, and how to use them?
  • Twitter (x-twitter) - Does a platform have a Twitter account dedicated to the API or even company?
  • Videos (x-videos) - Is there a page, YouTube, or other account dedicated to videos about the API?
  • Webhooks (x-webhook) - Are there webhooks available for an API?
  • Webinars (x-webinars) - Does an API conduct webinars to support operations?
  • White papers (x-white-papers) - Does a platform provide white papers as part of operations?
  • Widgets (x-widgets) - Are there widgets available for use as part of integration?
  • Wordpress (x-wordpress) - Are there WordPress plugins or code available?

There are hundreds of other building blocks I track on as part of API operations, but this list represents the most common, that often have dedicated URLs available for exploring, and have the most significant impact on API integrations. You'll notice there is an x- representation for each one, which I use as part of APIs.json indexes for all the APIs I track on. Some of these signal types are machine readable like OpenAPIs or a Blog RSS, with others machine readable because there is another API behind them, like Twitter or Github, but most of them are just static pages, where a human (me) can visit and stay in tune with signals.
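
To show how these x- types could be put to work once they sit in an index, here is a small sketch that reports which signals a provider exposes and which of those are machine readable. The short SIGNALS list is a subset chosen for illustration, the sample properties are invented, and treating only the OpenAPI and blog RSS types as machine readable is a simplifying assumption.

```python
# A handful of the signal types from the list above, chosen for illustration.
SIGNALS = [
    "x-documentation",
    "x-blog",
    "x-blog-rss-feed",
    "x-github",
    "x-openapi-spec",
    "x-terms-of-service-page",
]

# Assumption: only these two are treated as directly machine readable here.
MACHINE_READABLE = {"x-blog-rss-feed", "x-openapi-spec"}


def summarize_signals(properties):
    """Group the tracked signal types by whether a provider exposes them."""
    present = {p["type"] for p in properties if p.get("url")}
    return {
        "present": [s for s in SIGNALS if s in present],
        "missing": [s for s in SIGNALS if s not in present],
        "machine_readable": [s for s in SIGNALS if s in present and s in MACHINE_READABLE],
    }


# Invented properties for a hypothetical provider.
provider_properties = [
    {"type": "x-documentation", "url": "https://example.com/docs/"},
    {"type": "x-blog-rss-feed", "url": "https://example.com/blog/feed.xml"},
    {"type": "x-github", "url": "https://example.com/github"},
]

summary = summarize_signals(provider_properties)
print(summary)
```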

I have two primary objectives with this work: 1) identify the important signals, that impact integration, and will keep me and my readers in tune with what is going on, and 2) identify the common channels, and help move the more important ones to be machine-readable, allowing us to scale the monitoring of important signals like pricing and terms of service. My API Stack research provides me with a nice listing of APIs, as well as more individualized stacks like Microsoft, Google, and Facebook, or even industry stacks like SMS, Email, and News. It also provides me with a wealth of signals we can tune into to better understand the scope and health of the API sector, and any individual business vertical that is being touched by APIs.


Participating In The OpenAPI Feedback Loop

When you are an individual in a sea of tech giants, and startups who are moving technical conversations forward, it can be easy to just sit back, stay quiet, and go with the flow. As a single person, it feels like our voice will not be heard, or even listened to when it comes to moving forward standards and specifications like the OpenAPI, but in reality, every single voice that speaks up is important, and has the potential to bring a new perspective regarding what the future should hold when it comes to the roadmap.

If you are building any services or tooling that supports version 2.0 of the OpenAPI specification and will be looking to evolve your services or tooling to support version 3.0, you need to make sure and share your views. No matter where you are in the development of your tooling, planning or even deployment, you should make sure you gather and share your thoughts with the OpenAPI Initiative (OAI)--they have a form for tooling developers to submit their feedback and details about what you are up to.

Whether or not you submit your OAI tooling and service plans via the form they provide, you should be also telling your story on your blog. You don't have to have a big audience for your blog, you just need to make sure and publicly share the details of your tools and services, and your perspective of both the OpenAPI 2.0 and 3.0 versions. If you tell your story on your blog, and Tweet or email a link to me, I may even craft my own story based on your perspective, and publish to API Evangelist, and put in my OpenAPI Toolbox. Storytelling around the specification plays an important role in helping evolve the specification, as well as help onboard other folks to the API specification format.

As the only individual in the OAI, I can testify that I often feel like my voice is too small to make a difference. This is not true. Whether it's via the OpenAPI Github repo, directly via OpenAPI tooling feedback forms, or even via our blogs on the open Internet, your perspective on OpenAPI matters. Make sure it gets heard--if you don't step up and share via one of these open channels, you are guaranteeing that you won't be heard, and your views definitely will not matter. Make sure you step up, there is too much at stake when it comes to API definitions right now.


OpenAPI-Driven Documentation For Your API With ReDoc

ReDoc is the responsive, three-panel, OpenAPI specification driven documentation for your API that you were looking for. Swagger UI is still reigning king when it comes to API documentation generated using the OpenAPI Spec, but ReDoc provides a simple, attractive, and clean alternative to documentation.

ReDoc is deployable to any web page with just two tags--with the resulting documentation looking attractive on both web and mobile devices. Now you can have it all, your API documentation looking good, interactive, and driven by a machine-readable definition that will help you keep everything up to date.

All you need to fire up ReDoc is two lines of HTML on your web page:

The quickest way to deploy ReDoc is using the CDN step shown above, but they also provide bower or npm solutions, if that is your desire. There is also a Yeoman generator to help you share your OpenAPIs that are central to your web application operation, something we will write about in future posts here on the blog.

ReDoc leverages a custom HTML tag, and provides you with a handful of attributes for defining, and customizing their documentation, including spec-url, scroll-y-offset, suppress-warnings, lazy-rendering, hide-hostname, and expand-responses--providing some quick ways to get exactly what you need, on any web page.
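
For anyone generating pages programmatically, the tag and its attributes can be assembled with a small helper like the sketch below. It assumes the custom tag is named redoc and that attributes such as spec-url and hide-hostname behave as described in the project's README at the time; the render_redoc function and both example.com URLs are placeholders, so the real script location should be taken from the ReDoc documentation.

```python
from html import escape


def render_redoc(spec_url, script_url, **options):
    """Return the custom tag plus a script tag as a two-line HTML snippet.

    Keyword options use underscores (hide_hostname=True) and are written as
    hyphenated attributes; True becomes a bare attribute, False or None is
    skipped, and anything else becomes attribute="value".
    """
    attrs = ['spec-url="' + escape(spec_url, quote=True) + '"']
    for name, value in options.items():
        attr = name.replace("_", "-")
        if value is True:
            attrs.append(attr)
        elif value is not False and value is not None:
            attrs.append(attr + '="' + escape(str(value), quote=True) + '"')
    tag = "<redoc " + " ".join(attrs) + "></redoc>"
    script = '<script src="' + escape(script_url, quote=True) + '"></script>'
    return tag + "\n" + script


# Placeholder URLs; substitute the real definition and script locations.
page = render_redoc(
    "https://example.com/openapi.json",
    "https://example.com/redoc.min.js",
    scroll_y_offset=50,
    hide_hostname=True,
)
print(page)
```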

There are a handful of APIs that have put ReDoc to use as API documentation for their platform:

They also provide a live demo of ReDoc, allowing you to kick the tires some more before you deploy, and make sure it does what you will need it to before you fork.

ReDoc provides a simple, OpenAPI spec compliant way of delivering attractive, interactive, responsive and up to date documentation that can be deployed anywhere, including integration into your existing continuous integration, and API lifecycle. ReDoc reflects a new generation of very modular, plug and play API tooling that can be put to use immediately as part of an OpenAPI Spec-driven web, mobile, and device application development cycle(s).

ReDoc is available on Github: https://github.com/Rebilly/ReDoc, as an open source solution brought to you by Rebilly, “the world's first subscription and recurring profit maximization company”.


Quantifying The Data A Company Possesses Using APIs

Profiling APIs always provides me with a nice bulleted list of what a company does or doesn't do. In my work as the API Evangelist, I can read marketing and communications to find out what a company does, but I find that profiling their APIs provides a more honest view of what is going on. The lack of a public API always sets the tone for how I view what a company is up to, but when there is a public API, profiling it always provides a nice distillation of what a company does, in a nice bulleted list I can share with my readers.

When I profile the APIs of companies like Amazon, Google, and Microsoft, I come out of it with a nice bulleted list of what is possible, but when I go even further, making sure each API profile has accompanying schema definitions, a nice list of what data a company has begins to emerge. When I profile an API using OpenAPI I always start by profiling the request layer of an API, the paths, parameters, and other elements. Next, I get to work describing the schema definitions of data used in these requests, as well as the structure of the responses--providing me with a nice bulleted list of the data that a company has. 
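
The two lists described here fall out of an OpenAPI definition fairly directly. The sketch below assumes an OpenAPI 2.0 layout, with operations under paths and schemas under definitions, and uses a tiny invented definition rather than any real company's API; the profile function is a hypothetical helper.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def profile(definition):
    """Return (actions, data): what is possible, and what data is involved."""
    actions = []
    for path, operations in definition.get("paths", {}).items():
        for method, operation in operations.items():
            if method in HTTP_METHODS:  # skip path-level keys such as parameters
                actions.append(f"{method.upper()} {path} - {operation.get('summary', '')}")
    data = []
    for name, schema in definition.get("definitions", {}).items():
        for field in schema.get("properties", {}):
            data.append(f"{name}.{field}")
    return actions, data


# A tiny invented definition, not a real company's API.
example = {
    "swagger": "2.0",
    "info": {"title": "Example", "version": "1.0.0"},
    "paths": {
        "/users/{id}": {
            "parameters": [{"name": "id", "in": "path", "required": True, "type": "string"}],
            "get": {"summary": "Get a user", "responses": {"200": {"description": "OK"}}},
        }
    },
    "definitions": {
        "User": {"type": "object", "properties": {"email": {"type": "string"}, "birthday": {"type": "string"}}}
    },
}

actions, data = profile(example)
for line in actions + data:
    print("•", line)
```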

You can see this in action with my Facebook API profiling work. There is a bulleted list of what is possible (API definition), as well as what data is sent, received, and stored (API schema). This work provides me with a nice look at the data Facebook gathers and stores about everyone. It is FAR from a complete picture of the data Facebook gathers, but it does provide us with a snapshot to consider, as well as a model we can use to ask Facebook to share more schema about the data points that they track. API and data specification formats like JSON Schema and OpenAPI provide us with a toolbox to help us quantify and share the details of what data a company has, and what is possible when it comes to using this data in web, mobile, and device based applications.

I am fully aware of the boldness of this statement, but I feel that ALL companies should have a public API definition, including a catalog of the schema for data in use. Ideally, this schema would employ commonly used standards like Schema.org, but just having a machine-readable catalog of the schema would go a long way to helping pull back the curtain of how companies are using our data. I am not asking for companies to make data public, I am asking for companies to make the schema for this data public, showing what they track and store about us. I know many people view this as intellectual property, but in an increasingly un/insecure online world of digital privacy, we are going to have to begin pulling back the curtain a little bit, otherwise, a rich environment for exploitation and abuse will continue to develop.


Quantifying The API Landscape Across Amazon, Google, and Microsoft

I work to develop OpenAPI definitions for 3rd party APIs because it helps me understand what is being offered by a company. Even when I'm able to autogenerate an OpenAPI for an API, or come across an existing one, I still spend time going through the finer details of what an API does, or doesn't do. I find the process to be one of the best ways to learn about an API, stopping short of actually integrating with it.

Over the last couple of months, I've aggregated, generated, and crafted OpenAPI and APIs.json definitions for the top three cloud API providers out there. I wanted to be able to easily see the surface area for as many of the APIs as I could find for these three companies: Amazon, Google, and Microsoft.

I learned a lot about all three providers in the process. I filled my notebook with stories about their approaches. I also ended up with three separate Github repositories with APIs.json indexed OpenAPI definitions for as many of their APIs as I could process. They are far from complete, but I feel like they paint a pretty interesting picture of the API landscape across these three tech giants.

So far there are 6,420 paths across 181 individual services. I'm still working on the summary and tags for each path, which are the two most important elements for me. I think the list of 6,420 actions you can take via an API across three of the biggest cloud providers gives us a lot of insight into what these companies are up to. There are a lot of valuable resources in there, ranging from cloud to machine learning. These three projects are an ongoing part of my API stack research, and I will be adding to them as I have time. I'm looking to keep developing simple JavaScript and Liquid tooling on top of the repos, and the YAML data behind them--further helping me make sense of Amazon, Google, and Microsoft APIs.


Expressing What An API Does As Well As What Is Possible Using OpenAPI

I am working to update my OpenAPI definitions for AWS, Google, and Microsoft using some other OpenAPIs I've discovered on Github. When a new OpenAPI has entirely new paths available, I just insert them, but when it has an existing path I have to think more critically about what is next. Sometimes I dismiss the metadata about the API path as incomplete or lower quality than the one I have already. Other times the content is actually superior to mine, and I incorporate it into my work. Now I'm also finding that in some cases I want to keep my representation, as well as the one I discovered, side by side--both having value.
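The "insert new paths, stop and think about existing ones" routine can be sketched roughly as follows. This is an illustration under stated assumptions, not the process actually used for these repositories: the function name `merge_paths` is hypothetical, both definitions are assumed to be loaded as Python dictionaries, and deciding which version of a conflicting path wins is deliberately left to a human.

```python
import copy


def merge_paths(mine, discovered):
    """Add paths that only exist in `discovered`; report paths that need a human decision.

    Returns (merged_definition, conflicting_paths). Neither input is modified.
    """
    merged = copy.deepcopy(mine)
    merged_paths = merged.setdefault("paths", {})
    conflicts = []
    for path, path_item in discovered.get("paths", {}).items():
        if path not in merged_paths:
            # Entirely new path: just insert it.
            merged_paths[path] = copy.deepcopy(path_item)
        elif merged_paths[path] != path_item:
            # Existing path with different metadata: keep mine, flag it for review.
            conflicts.append(path)
    return merged, sorted(conflicts)


# Hypothetical definitions for illustration.
mine = {"paths": {"/instances": {"get": {"summary": "List my instances"}}}}
discovered = {
    "paths": {
        "/instances": {"get": {"summary": "Describes instances"}},
        "/images": {"get": {"summary": "Describes images"}},
    }
}

merged, conflicts = merge_paths(mine, discovered)
print(sorted(merged["paths"]))  # paths now present in the merged definition
print(conflicts)                # paths where both versions need a side by side look
```

Returning the conflicts rather than resolving them keeps the judgment call (dismiss, incorporate, or keep both) outside the code.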

This is one reason I'm not 100% sold on the idea that only API providers should be crafting their own OpenAPIs--sure, the API space would be waaaaaay better if ALL API providers had machine readable OpenAPIs for all their services, but I wouldn't want it to end there. You see, API providers are good (sometimes) at defining what their API does, but they often suck at telling you what is possible--which is why they are doing APIs. I have a lot of people who push back on me creating OpenAPIs for popular APIs, telling me that API providers should be the ones doing the hard work, otherwise it doesn't matter. I'm just not sold that this is the case, and there is an opportunity for evolving the definition of an API by external entities using OpenAPI.

To help me explore this idea, and push the boundaries of how I use OpenAPI in my API storytelling, I wanted to frame this in the context of the Amazon EC2 API, which allows me to deploy a single unit of compute into the cloud using an API, a pretty fundamental component of our digital worlds. To make any call against the Amazon EC2 API, I send all my calls to a single base URL:

ec2.amazonaws.com

With this API call I pass in the "action" I'd like to be taken:

?Action=RunInstances

Along with this base action parameter, I pass in a handful of other parameters to further define things:

&ImageId=ami-60a54009&MaxCount=1&KeyName=my-key-pair&Placement.AvailabilityZone=us-east-1d
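Put together, the base URL, the action, and the extra parameters form a single query string. The sketch below only assembles that URL from the pieces shown above; the `https` scheme and the function name `build_ec2_url` are assumptions, and a real EC2 request also needs an API version and AWS request signing, neither of which is attempted here.

```python
from urllib.parse import urlencode


def build_ec2_url(action, params):
    """Assemble an unsigned EC2-style query URL from an action and its parameters."""
    query = {"Action": action}
    query.update(params)
    # The scheme is an assumption; the text above only names the host.
    return "https://ec2.amazonaws.com/?" + urlencode(query)


url = build_ec2_url(
    "RunInstances",
    {
        "ImageId": "ami-60a54009",
        "MaxCount": 1,
        "KeyName": "my-key-pair",
        "Placement.AvailabilityZone": "us-east-1d",
    },
)
print(url)
```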

Amazon has never been known for superior API design, but it gets the job done. With this single API call I can launch a server in the clouds. The first time I was able to do this was when the light really went on in my head regarding the potential of APIs. However, back to my story on expressing what an API does, as well as what is possible using OpenAPI. AWS has done an OK job at expressing what the Amazon EC2 API does, however they suck at expressing what is possible. This is where API consumers like me step up with OpenAPI and provide some alternative representations of what is possible with this highly valuable API.

When I define the Amazon EC2 API using the OpenAPI specification I use the following:

swagger: '2.0'
info:
  title: Amazon EC2
host: ec2.amazonaws.com
paths:
  /:
    get:
      summary: The Amazon EC2 service
      operationId: ec2API
    parameters:
      - in: query
        name: action

The AWS API design pattern doesn't lend itself to reuse when it comes to documentation and storytelling, but I'm always looking for an opportunity to push the boundaries, and I'm able to better outline all available actions, as individual API paths by appending the action parameter to the path:

swagger: '2.0'
info:
  title: Amazon EC2
host: ec2.amazonaws.com
paths:
  /?Action=RunInstances/:
    get:
      summary: Run a new Amazon EC2 instance
      operationId: runInstance

Now I'm able to describe all 228 actions you can take with the single Amazon EC2 API path as separate paths in any OpenAPI generated API documentation and tooling. I can give them unique summaries, descriptions, and operationId. OpenAPI allows me to describe what is possible with an API, going well beyond what the API provider was able to define. I've been using this approach to better quantify the surface area of APIs like Amazon, Flickr, and others who use this pattern for a while now, but as I was looking to update my work, I wanted to take this concept even further.
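Writing out one path per action by hand gets tedious, so the expansion can be generated. The following is a rough sketch, assuming a plain mapping of action names to summaries; the function name `expand_actions`, the summaries for the second and third actions, and the rule for deriving `operationId` (lowercasing the first letter of the action) are all illustrative choices rather than part of the original definitions.

```python
def expand_actions(actions):
    """Build one OpenAPI path entry per action, using the '/?Action=Name/' convention."""
    paths = {}
    for action, summary in actions.items():
        paths["/?Action=%s/" % action] = {
            "get": {
                "summary": summary,
                # Hypothetical naming rule: RunInstances -> runInstances
                "operationId": action[0].lower() + action[1:],
            }
        }
    return paths


# Three of the EC2 actions, with illustrative summaries.
paths = expand_actions(
    {
        "RunInstances": "Run a new Amazon EC2 instance",
        "StopInstances": "Stop a running Amazon EC2 instance",
        "TerminateInstances": "Terminate an Amazon EC2 instance",
    }
)

for path, item in paths.items():
    print(path, "-", item["get"]["summary"])
```

The resulting dictionary can be dropped under `paths` in the definition, leaving the summaries and descriptions as the part worth hand-editing.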

While appending query parameters to the path definition has allowed me to expand how I describe the surface area of an API using OpenAPI, I'd rather keep these parameters defined properly using the OpenAPI specification, and define an alternative way to make the path unique. To do this, I am exploring the usage of #bookmarks, to help make duplicate API paths more unique in the eyes of the schema validators, but invisible to the server side of things--something like this:

swagger: '2.0'
info:
  title: Amazon EC2
host: ec2.amazonaws.com
paths:
  /#RunInstance/:
    get:
      summary: Run a new Amazon EC2 instance
      operationId: runInstance
    parameters:
      - in: query
        name: action
        default: RunInstances

I am considering how we can further make the path unique, by predefining other parameters using default or enum:

swagger: '2.0'
info:
  title: Amazon EC2
host: ec2.amazonaws.com
paths:
  /#RunWebSiteInstance/:
    get:
      summary: Run a new Amazon EC2 website instance
      description: The ability to launch a new website running on its own Amazon EC2 instance, from a predefined AWS AMI.
      operationId: runWebServerInstance
    parameters:
      - in: query
        name: action
        default: RunInstances
      - in: query
        name: ImageId
        default: ami-60a54009

I am still drawing in the lines of what the API provider has given me, but I'm now augmenting with a better summary and description of what is possible using OpenAPI, which can now be reflected in documentation and other tooling that is OpenAPI compliant. I can even prepopulate the default values, or available options using enum settings, tailoring to my team, company, or other specific needs. Taking an existing API definition beyond its provider's interpretation of what it does, and getting to work on being more creative around what is possible.
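To show why the #bookmark stays invisible to the server, here is a small sketch of how tooling might turn one of these bookmarked paths, plus its default parameter values, back into an actual request URL. The function name `resolve_request` and the `https` scheme are assumptions, the parameter names are copied from the definitions above, and no existing OpenAPI tool is claimed to behave this way.

```python
from urllib.parse import urlencode


def resolve_request(host, path, path_item, method="get"):
    """Turn a bookmarked path and its parameter defaults into a request URL."""
    # Everything after '#' is a URL fragment, which HTTP clients do not send to the server.
    real_path = path.split("#", 1)[0] or "/"
    operation = path_item.get(method, {})
    # Parameters may be declared on the path item or on the operation itself.
    parameters = path_item.get("parameters", []) + operation.get("parameters", [])
    defaults = {
        p["name"]: p["default"]
        for p in parameters
        if p.get("in") == "query" and "default" in p
    }
    url = "https://" + host + real_path  # scheme is an assumption
    if defaults:
        url += "?" + urlencode(defaults)
    return url


path_item = {
    "get": {
        "summary": "Run a new Amazon EC2 website instance",
        "operationId": "runWebServerInstance",
    },
    "parameters": [
        {"in": "query", "name": "action", "default": "RunInstances"},
        {"in": "query", "name": "ImageId", "default": "ami-60a54009"},
    ],
}

resolved = resolve_request("ec2.amazonaws.com", "/#RunWebSiteInstance/", path_item)
print(resolved)
```

However many bookmarked variations exist in the definition, they all resolve to the same server path and differ only in the query string built from their defaults.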

Let me know how incoherent this is. I can't tell sometimes. Maybe I need more examples of this in action. I feel like it might be a big piece of the puzzle that has been missing for me regarding how we tell stories about what is possible with APIs. When it comes to API definitions, documentation, and discovery I feel like we are chained to a provider's definition of what is possible, when in reality this shouldn't be what drives the conversation. There should be definitions, documentation, and discovery documents created by API providers that help articulate what an API does, but more importantly, there should be a wealth of definitions, documentation, and discovery documents created by API consumers that help articulate what is possible. 


SDK Generation, API Validation And Transformation Using The APIMATIC CLI

Continuing a growing number of command line interfaces (CLI) being deployed side by side with APIs, SDK generation provider APIMATIC just released a CLI for their platform, continuing their march towards being a continuous integration provider. There was a great post the other day on Nordic APIs about CLI, highlighting one way API providers seem to be investing in CLIs to help increase the chances that their services will get baked into applications and system integrations.

"APIMatic CLI is a command line tool written in Python which serves as a wrapper over our own Python SDK. It is available in the form of a small windows executable so you can easily plug it into your build cycle. You no longer have to write your own code or set up a development environment for the consumption of our APIs."

SDK generation, API validation and API transformation baked into your workflow, all driven by API definitions available via any URL. This is a pretty important layer of your API lifecycle, something that isn't easily done if you are still battling the monolith system, but when you've gone full microservices, DevOps, Continuous Integration Kung Fu (tm), it provides you with a pretty easy way to define, deploy, and validate API endpoints, as well as the system integrations that consume those APIs--all driven and choreographed by your API definitions.

I'm still very API-centric in my workflows and use the APIMATIC API to generate SDKs, and API Transformer to translate API definitions from one format to the other, but I understand that a CLI is more practical for the way some teams operate. API service providers seem to be getting the message and responding to developers with a fully functional CLI, as well as an API, like Zapier did the other day with the release of their CLI--pushing the boundaries of what is a continuous integration platform as a service.

I asked Adeel of APIMATIC when they would have a CLI generator based upon API definitions. I mean their CLI is a tool that wraps the APIMATIC SDK, which I assume is generated by APIMATIC, using an API definition. Why can't they just autogenerate a CLI for their customers using the same API definition being used to generate the SDK? He responded with a smile. ;-)

Disclosure: I am an advisor to APIMATIC.


API Definitions Should Be Done By The API Provider

I talk to a lot of API service and tooling providers about API definitions. I've long been an advocate for API service providers supporting OpenAPI, as well as a variety of API definition formats--if you are having trouble doing this, check out API Transformer. While service providers are an important link in the API definition chain, support of API specification by API providers themselves, and the availability of definitions for all of their APIs is another very critical link in this API supply chain.

During a discussion with an iPaaS provider this week about the availability of OpenAPI definitions, a comment was made about there not being enough good sources of usable definitions, specifically from API providers themselves. While it is true that there is not as much adoption by leading API providers as I would like to see, it still is pretty easy to find numerous proactive efforts by API providers like SendGrid, NY Times, and Azure--just to name a few. Of course, I want ALL API providers to maintain an accurate, comprehensive set of API definitions for their operations, but I don't think this is the reality we live in, or even where all API definition creation should occur.

It is part of the API DNA for the lion's share of the innovation to come from external sources. Sure, I think every API provider would be better off if they maintained their own OpenAPI, generating documentation, code, and other resources. But, I also think it is perfectly acceptable for the community to do some of this heavy lifting. Not all API providers are going to have the skills, time, and other resources to make this happen--often times this is precisely why they are doing APIs, to outsource, and share the innovation load.

I see API definitions and discovery as a community thing, something API providers, consumers, as well as API service providers should all be contributing to. No matter where you operate in the space, you should be sharing your API definitions using Github, even if they are for the 3rd party API providers you depend on and might seem duplicative. You never know when your definition might have a piece of the puzzle another developer is looking for, allowing them to build on top of your existing work, and vice versa. When you define your APIs out in the open like this, we all win, because API definitions benefit the API provider, consumer, services, and tooling developers.


Separating The Licensing Layers Of Your Valuable Data Using APIs

Data is power. If you have valuable data, people want it. While this is the current way of doing things on the Internet, it really isn't a new concept. The data in databases has always been wielded alongside business and political objectives. I have worked professionally as a database engineer for 30 years this year, with my first job building COBOL databases for use in schools across the State of Oregon in 1987, and have seen many different ways that data is the fuel for the engines of power.

Data is valuable. We put a lot of work into acquiring, creating, normalizing, updating, and maintaining our data. However, this value only goes so far if we keep it siloed, and isolated. We have to be able to open up our data to other stakeholders, partners, or possibly even the public. This is where modern approaches to APIs can help us in some meaningful ways, allowing data stewards to sensibly, and securely open up access to valuable API resources using low-cost web technology. The most common obstacles I see impeding companies, organizations, institutions, and agencies from achieving API success center around restrictive views on data licensing, not being able to separate the data layers by using APIs, and being overly concerned about a loss of power when you publish APIs.

You worked hard to develop the data you have, but you also want to make it accessible. To protect their interests I see many folks impose pretty proprietary restrictions around their data, which ends up hurting its usage and viability in partner systems, and introducing friction when it comes to accessing and putting data to work--when this is the thing you really want as a data steward. Let me take a stab at helping you reduce this friction by better understanding how APIs can help you peel the licensing onion layers back when it comes to your valuable data.

Your Valuable Data
This is an example point of contact record. I've worked hard to create this bit of data (not really), and maintain a relationship with this point of contact. It takes time to validate that their record is up to date, always reflecting reality in my database.

While openly licensed data is one important piece of the puzzle, and data should be openly licensed when it makes sense, this is the layer of this discussion you may want to be a little more controlling in who has access to, and how partners and the public are able to put your data to use in their operations.

In an online, always on, digital environment, you want data accessible, but to be able to do this you need to think critically about how you peel back the licensing onion when it comes to this data.

The Schema For Your Data
The first layer to peel back when you are looking to make data more accessible with APIs is at the schema level. This is the names, description, data type, and other details about the meta layer of your valuable data--it isn't the data, but the description of the structure of your data.

Ideally, your schema already employs predefined schemas like we find at Schema.org, or Open Referral. Following common definitions will significantly widen the audience for any dataset, allowing data to seamlessly be used across a variety of systems. These forms of schema are openly licensed, and usually put into the public domain.

The schema layer of open data can often resemble the data itself, using machine readable formats like XML, JSON, and YAML. This is most likely the biggest contributing factor for data stewards failing to see this as a separate layer of access from the data itself, and sometimes applying a restrictive license, or forgetting to license it at all.

Data is often more ephemeral than the schema. Ideally, schemas do not change often, are shared and reused, and are free from restrictive licensing. For system integrations to work, and for partnerships to be sustainable, we need to speak a common language, and schema is how we describe our data so it can be put to use outside our firewall.

Make sure the schema for your data is well-defined, machine readable, and openly licensed for the widest possible use.

Defining Access To Data Using OpenAPI 
The third layer of this licensing onion is the API layer, which governs access to data, defining how requests are made upon data, and how responses are structured. Many API providers are putting OpenAPI to work to define this layer of data operations.

As with the schema layer of data operations, you are hoping that other companies, organizations, institutions, and government agencies will integrate this layer into their operations. This layer is much more permanent than the ephemeral data layer, and should be well defined, ideally sharing common patterns, and free from restrictive licensing.

Per the Oracle v Google Java API copyright case in the United States, the API layer is subject to copyright enforcement, meaning the naming and ordering of the surface area of your API can be copyrighted. If you are looking for others to comfortably integrate this definition into their operations, it should be openly licensed.

The API layer is not your data. It defines how data will be accessed, and put to use. It is important to separate this layer of your data operations, allowing it to be shared, reused, and implemented in many different ways supporting web, mobile, voice, bot, and a growing number of API driven applications.

Make sure the API layer to data operations is well-defined, machine readable, and free from restrictive licensing when possible. 

 

Currently, many data providers I talk to see this all as a single entity. It's our data. It's valuable. We need to protect it. Yet at the same time they really want it put to work in other systems, by partners, or even the public. Because they cannot separate the layers, and understand the need for separate licensing considerations, they end up being very closed with the data, schema, and API layers of their operations--introducing friction at all steps of the application and data life cycle.

Modern approaches to API management and logging at the API layer are how savvy data stewards are simultaneously opening up access and maintaining control over data, while also increasing awareness around how data is being put to use, or not used. Key-based access, rate limits, and access plans are all approaches to opening up access to data, while maximizing control, and maintaining a desired balance of power between steward, partners, and consumers. In this model, your schema and API definition need to be open, accessible, and shareable, where the data itself can be much more tightly controlled, depending on the goals of the data steward, and the needs of consumers.

Let me know if you want to talk through the separation of these layers of licensing and access around your data resources. I'm all for helping you protect your valuable data, but in a pragmatic way. If you want to be successful in partnering with other stakeholders, you need to be thinking more critically about separating the layers of your data operations, and getting better at peeling back the onion of your data operations--something that seems to leave many folks with tears in their eyes.


YAML Templates For API Operations

I am seeing a significant number of infrastructure orchestration solutions in the cloud start using YAML templates as the core set of settings and instructions for workflows. Amazon recently introduced YAML templates for AWS CloudFormation, where you can define the infrastructure templates you are using throughout the API life cycle. These AWS YAML templates are fast becoming the central definition to be used across AWS operations, with support in the AWS Service Catalog.

Whether you use AWS or not, working to define your infrastructure using YAML templates helps define what is going on. I'm seeing significant adoption of OpenAPIs in YAML, and I'm even beginning to create API operational indexes using APIs.json converted into YAML (there is a naming lesson in there). I also have YAML templates for each area of my API lifecycle research, providing me machine readable definitions for everything from news to patents, that I then use across my API research and storytelling.

I feel the same way about YAML as I did about JSON a decade ago, and it is quickly becoming my preferred format for any structured data, no matter where it is used in my operations. YAML + Github is quickly becoming the engine for some interesting ways of delivering infrastructure, and specifically API infrastructure, in a consistent way that is easy to communicate with others. This example focuses on AWS usage of YAML, but it is something I'm seeing from Google and other tech giants. I think the readability of YAML (minus quotes and brackets) makes it accessible to a wider audience beyond developer groups, something that is going to be critical to doing APIs at scale.


Tooling For Converting Your OpenAPI Definitions From 2.0 to 3.0

I wrote a post asking what it would take to migrate OpenAPI tooling from version 2.0 to 3.0 of the API specification, and Mike Ralphson (@PermittedSoc) commented about some of the projects he's been working on involving the latest specification version. Which I hope is a good sign of things to come, when it comes to moving from version 2.0 to 3.0 in 2017.

Mike has developed an OpenAPI converter and validator to help people migrate their OpenAPI definitions from 2.0 to 3.0. The open source tool also has an online version of the OpenAPI converter and validator for using in the browser, and of course, it also has an OpenAPI conversion and validation API, because ALL API tools and services should have an API--it is just good API craft.

It makes sense that some of the first tools to emerge are for conversion. Many API developers will need to convert their existing API definitions into version 3.0 of the specification to begin learning about what is new with OpenAPI 3.0. If you have examples of OpenAPI 3.0 definitions for your API, please publish them to Github and share with me, so I can help point folks to examples in the wild that they can learn from as we make this shift.
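To give a feel for what such a conversion involves, here is a deliberately small sketch of the top-level restructuring between the two versions. It is not Mike's converter and it is nowhere near complete: it only maps `host`, `basePath`, and `schemes` to `servers`, moves `definitions` under `components.schemas`, and rewrites matching `$ref` values. Body parameters, `consumes`/`produces`, and security definitions are left untouched, the fallback to `https` when no schemes are listed is an assumption, and the function names and sample definition are hypothetical.

```python
def _rewrite_refs(node):
    """Recursively point '#/definitions/X' references at '#/components/schemas/X'."""
    if isinstance(node, dict):
        return {
            key: value.replace("#/definitions/", "#/components/schemas/")
            if key == "$ref" and isinstance(value, str)
            else _rewrite_refs(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [_rewrite_refs(item) for item in node]
    return node


def convert_basics(swagger):
    """Convert only the top-level layout of a Swagger 2.0 dict to OpenAPI 3.0."""
    if swagger.get("swagger") != "2.0":
        raise ValueError("expected a Swagger 2.0 definition")
    converted = {
        "openapi": "3.0.0",
        "info": _rewrite_refs(swagger.get("info", {})),
        "paths": _rewrite_refs(swagger.get("paths", {})),
    }
    if swagger.get("host"):
        base_path = swagger.get("basePath", "")
        schemes = swagger.get("schemes", ["https"])  # assumed fallback
        converted["servers"] = [
            {"url": "%s://%s%s" % (scheme, swagger["host"], base_path)}
            for scheme in schemes
        ]
    if "definitions" in swagger:
        converted["components"] = {"schemas": _rewrite_refs(swagger["definitions"])}
    return converted


# Hypothetical Swagger 2.0 definition for illustration.
source = {
    "swagger": "2.0",
    "info": {"title": "Example API", "version": "1.0"},
    "host": "api.example.com",
    "basePath": "/v1",
    "schemes": ["https"],
    "paths": {
        "/users": {
            "get": {
                "responses": {"200": {"schema": {"$ref": "#/definitions/User"}}}
            }
        }
    },
    "definitions": {"User": {"type": "object"}},
}

converted = convert_basics(source)
print(converted["servers"])
```

Even this partial pass shows where the work is: the response above still carries a 2.0-style `schema` key, which a real converter would need to move under `content`.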


Playing With Different Views For An OpenAPI Diff Tool

I am working on version 1.1 of the API definition for the human services data specification (HSDS), and I needed some help articulating the differences between version 1.0 and 1.1, which are both defined using the OpenAPI 2.0 specification. I manage all of my OpenAPIs using Github, but I needed a friendlier way to show the diff between two JSON files, than what Github provides. I got to work on a version that would run using Liquid, that would work in Jekyll, which all my sites and tools run in.

I have a variety of API documentation tools that run on Github, so I wanted to develop an interface that showed two separate OpenAPI definitions side by side on a simple HTML page, so at this stage, I'm playing with different ways of showing the differences between paths, and other elements of the API definition. I'm not entirely happy with what I have, but I started applying a red DIFF label to any path that isn't represented in the previous API definition. It works well for helping me see which API endpoints have been added or changed in the latest version.
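The path comparison behind that DIFF label boils down to set arithmetic. The tool described here runs in Liquid on Jekyll; the sketch below expresses the same idea in Python purely for illustration, assuming both versions are loaded as dictionaries. The function name `diff_paths` and the two sample definitions are hypothetical, not the actual HSDS paths.

```python
def diff_paths(previous, latest):
    """Compare the paths of two OpenAPI definitions loaded as dictionaries."""
    old_paths = previous.get("paths", {})
    new_paths = latest.get("paths", {})
    old, new = set(old_paths), set(new_paths)
    return {
        "added": sorted(new - old),      # would get the DIFF label
        "removed": sorted(old - new),
        "changed": sorted(p for p in old & new if old_paths[p] != new_paths[p]),
    }


# Hypothetical stand-ins for two versions of a definition.
version_1_0 = {
    "paths": {
        "/organizations": {"get": {"summary": "List organizations"}},
        "/locations": {"get": {"summary": "List locations"}},
    }
}
version_1_1 = {
    "paths": {
        "/organizations": {"get": {"summary": "List all organizations"}},
        "/locations": {"get": {"summary": "List locations"}},
        "/services": {"get": {"summary": "List services"}},
    }
}

result = diff_paths(version_1_0, version_1_1)
print(result)
```

Keeping "added" and "changed" as separate lists leaves room to label them differently in the layout if one red label turns out to be too coarse.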

Currently, I am just looking to understand the differences in paths available, but I will be adding diffs for headers, parameters, and other elements of the API definition. I'm worried about things getting too cluttered with bigger API definitions. I'm trying to keep things fast loading, and something I can work with non-developers on, so I want to be thoughtful about what I add, and how I add it to the layout. I'm looking to get a group of business users up to speed on where things are going with the spec, and encourage them to get more involved with future versions--so not scaring them off is an important part of this conversation.

I am finding Jekyll, Liquid, and HTML, with a little JavaScript when necessary, to be a perfect medium for developing OpenAPI tooling on top of. It provides a simple, static, and flexible way to craft API tooling that can be forked by anyone. As my proficiency with Liquid evolves, I'm getting better at making it work with OpenAPI YAML which is mounted via Jekyll. With everything operating on Github, version control and API access to all my files are adding a valuable layer to my work. Now that I have several of these types of tools available, I also need to get more organized about how I'm evolving the code and maintaining a catalog of these offerings so that others can put them to use in their API operations.


Open Source Drag And Drop API Lifecycle Design Tooling

I'm always on the hunt for new ways to define, design, deploy, and manage API infrastructure, and thought the AWS Cloud Formation Designer provides a nice look at where things might be headed. AWS CloudFormation Designer (Designer) is a graphic tool for creating, viewing, and modifying AWS CloudFormation templates, which translates pretty nicely to managing your API infrastructure as well.

While the AWS CloudFormation Designer spans all AWS services, all the elements are there for managing all the core stops along the API life cycle like definition, design, DNS, deployment, management, monitoring, and others. Each of the Amazon services is available with a listing of each element available for the service, complete with all the inputs and outputs as connectors on the icons. Since all the AWS services are APIs, it's basically a drag and drop interface for mapping out how you use these APIs to define, design, deploy and manage your API infrastructure.

Using the design tool you can create templates for governing the deployment and management of API infrastructure by your team, partners, and other customers. This approach to defining the API life cycle is the closest I've seen to what stimulated my API subway map work, which became the subject of my keynotes at APIStrat in Austin, TX. It allows API architects and providers to templatize their approaches to delivering API infrastructure, in a way that is plug and play, and evolvable using the underlying JSON or YAML templates--right alongside the OpenAPI templates we are crafting for each individual API.

The AWS Cloud Formation Designer is a drag and drop UI for the entire AWS API stack. It is something that could easily be applied to Google's API stack, Microsoft, or any other stack you define--something that could easily be done using APIs.json, developing another layer of templating for which resource types are available in the designer, as well as the formation templates generated by the design tool itself. There should be an open source "API formation designer" available, that could span cloud providers, allowing architects to define which resources are available in their toolbox--that anyone could fork and run in their environment.

I like where AWS is headed with their Cloud Formation Designer. It's another approach to providing full lifecycle tooling for use in the API space. It almost reminds me of Yahoo Pipes for the AWS Cloud, which triggers awkward feels for me. I'm hoping it is a glimpse of what's to come, and someone steps up with an even more attractive drag and drop version, that helps folks work with API-driven infrastructure no matter where it runs--maybe Google will get to work on something. They seem to be real big on supporting infrastructure that runs in any cloud environment. *wink wink*


What Questions Would You Ask Across 50K API Definitions?

Mike Ralphson (@PermittedSoc) asked me the other day, "if you could run SQL / #GraphQL queries over nearly 50K #API definitions, what would you ask?". I told him I would respond via blog post, which is one way I help amplify the conversation I have with other API folks in the space. Mike is doing some important work when it comes to API discovery, something that needs amplification if we are going to move this conversation forward.

Ok, so what would I ask of nearly 50K API definitions, if I had the opportunity to ask? Here are some of my answers:

  • What are all the paths used? - I'd like to see a list of all path folders, separated by the forward slash, minus any parameters.
  • Which path folders are actually words? - I'd like to know how coherent API design patterns are, and how many are actually words in a dictionary.
  • What are all the tags applied? - A listing of all tags applied to APIs, with counts for each time it is used.
  • What are all the parameters? - A listing of all the parameters applied to APIs, with counts for each used.
  • What are all the definitions? - A listing of all definitions used as part of API requests and responses, with counts next to each.
  • How many don't have definitions? - What percentage of the API definitions do not have definitions describing their responses?
  • What businesses are involved? - A listing of all companies involved with API definitions from contact information to domain ownership.
  • What is whois information behind each domain(s)? - Pull whois information for all the domains involved in API definitions.
  • Which definitions do not have path summary or description? - Which definitions have omitted the description for the path.
  • What's the average length of path summary and descriptions? - Of the definitions with a description or summary, what is the average length?
  • How many APIs provide a link to terms of service? - Checking to see if a term of service is part of the definition.
  • How many APIs provide contact information for an API? - Checking to see if contact information is part of the definition.
  • How many APIs describe their headers? - Looking for specific headers described as part of each path definition.
  • How many APIs use the body as part of their request? - Looking for heavy body use as part of the request surface area of an API.
  • How many APIs have a security definition? - Which of the APIs has provided a definition for how an API is secured?
  • How many APIs do not use response codes? - Which of the APIs do not provide response codes for their API responses?
  • Which API response HTTP status codes are used? - Provide a list of API response HTTP status codes used, with counts for each.
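
A few of the questions above (path folders, tags, and status codes) lend themselves to a quick script. This is only a minimal sketch, assuming each definition has already been loaded as a Python dictionary in the usual Swagger / OpenAPI shape. The function names and the sample definition are made up for illustration and are not part of Mike's tooling.

```python
from collections import Counter

# HTTP methods that can appear as keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def path_segments(path):
    """Split a path into its folders, dropping {parameter} placeholders."""
    return [s for s in path.split("/") if s and not s.startswith("{")]


def summarize(definitions):
    """Count path folders, tags, and response status codes across definitions."""
    segments, tags, statuses = Counter(), Counter(), Counter()
    for definition in definitions:
        for path, item in definition.get("paths", {}).items():
            segments.update(path_segments(path))
            for method, operation in item.items():
                if method not in HTTP_METHODS:
                    continue  # skip path-level keys such as "parameters"
                tags.update(operation.get("tags", []))
                statuses.update(str(code) for code in operation.get("responses", {}))
    return segments, tags, statuses


# Hypothetical sample definition, standing in for the 50K real ones.
sample = [{
    "paths": {
        "/users/{id}/orders": {
            "get": {"tags": ["orders"], "responses": {"200": {}, "404": {}}},
        },
        "/users": {
            "get": {"tags": ["users"], "responses": {"200": {}}},
        },
    }
}]

segments, tags, statuses = summarize(sample)
print(segments.most_common())
print(tags.most_common())
print(statuses.most_common())
```

The same loop could be extended to count parameters, missing descriptions, or security definitions, since they all live in predictable places in the definition.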

That is a starting list of what I'd like to ask of 50K API definitions. I'd have to think and write about it more to come up with more creative questions. I recommend publishing them all to a Github repository and letting people start asking questions via an interface. You might not be able to answer all of the questions, but it would be interesting to see what people want to know--I am sure you could develop a pretty interesting look at how folks see API discovery, and what they are looking for.

API discovery is one of the areas of the API life cycle that is pretty deficient due to how people view their APIs and how they often have their blinders on regarding the wider API community. Most folks are thinking about their APIs, and maybe a handful of other APIs they are familiar with but do not consider API usage across industries, or the entire space. We need more work like this to occur. We need more API definitions to be made available, as well as more dynamic ways for folks to discover, understand and learn about APIs that already exist.


OpenAPI As An API Literacy Tool

I've been an advocate for OpenAPI since its release, writing hundreds of stories about what is possible. I do not support OpenAPI because I think it is the perfect solution, I support it because I think it is the scaffolding for a bridge that will get us closer to a suitable solution for the world we have. I'm always studying how people see OpenAPI, both positive and negative, in hopes of better crafting examples of it being used, and stories about what is possible with the specification.

When you ask people what OpenAPI is for, the most common answer is documentation. The second most common answer is for generating code and SDKs. People often associate documentation and code generation with OpenAPI because these were the first two tools that were developed on top of the API specification. I do not see much pushback from the folks who don't like OpenAPI when it comes to documentation, but when it comes to generating code, I regularly see folks criticizing the concept of generating code using OpenAPI definitions.

When I think about OpenAPI I think about a life cycle full of tools and services that can be delivered, with documentation and code generation being just two of them. I feel it is shortsighted to dismiss OpenAPI because it falls short in any single stop along the API lifecycle as I feel the most important role for OpenAPI is when it comes to API literacy--helping us craft, share, and teach API best practices.

OpenAPI, API Blueprint, and other API definition formats are the best way we currently have to define, share, and articulate APIs in a single document. Sure, a well-designed hypermedia API allows you to navigate, explore, and understand the surface area of an API, but how do you summarize that in a shareable and collaborative document that can also be used for documentation, monitoring, testing, and other stops along the API life cycle?

I wish everybody could read the latest API design book and absorb the latest concepts for building the best API possible. Some folks learn this way, but in my experience, a significant segment of the community learns from seeing examples of API best practices in action. API definition formats allow us to describe the moving parts of an API request and response, which then provides an example that other API developers can follow when crafting their own APIs. I find that many people simply follow the API examples they are familiar with, and OpenAPI allows us to easily craft and share these examples for them to follow.

If we are going to do APIs at scale, we need to help folks craft RESTful web APIs, as well as hypermedia, GraphQL, and gRPC APIs. We need a way to define our APIs, and articulate this definition to our consumers, as well as to other API providers. This helps me remember to not get hung up on any single use of OpenAPI, and other API specification formats, because first and foremost, these formats help us with API literacy, which has wider implications than any single API implementation, or stop along the API life cycle.


Taking A Look At The Stoplight API Spec Editor

I'm keeping an eye on the different approaches by API service providers when it comes to providing API editors within their services and tooling. While I wish there was an open source GUI API editor out there, the closest thing we have is from these API service providers, and I am trying to track what the best practices are so that when someone does step up and begin working on an open, embeddable solution, they can learn from my stories about what is working or not working across the space.

One example I think has characteristics that should be emulated is the API Spec Editor from Stoplight. The GUI editor lets you manage all the core elements of an OpenAPI definition like the general info, host, paths, and even the shared responses and parameters. They even provide what they call a CRUD builder where you paste the JSON schema, and they'll generate the common paths you will need to create, read, update, and delete your resources. Along the way you can also make calls to API endpoints using their interactive interface, helping ensure your API definition is actually in alignment with your API.
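
To make the CRUD builder idea concrete, here is a rough sketch of what generating the common paths from a named schema could look like. It is not Stoplight's implementation. The `crud_paths` function, the `id` parameter, and the response codes are all assumptions chosen for illustration, using the Swagger 2.0 layout where shared schemas live under `definitions`.

```python
def crud_paths(resource, schema_name):
    """Build Swagger 2.0 style paths for create, read, update, and delete."""
    # Reference to a schema assumed to exist under "definitions".
    ref = {"$ref": "#/definitions/" + schema_name}
    id_param = {"name": "id", "in": "path", "required": True, "type": "string"}
    body_param = {"name": "body", "in": "body", "required": True, "schema": ref}
    collection = "/" + resource
    return {
        collection: {
            "get": {
                "summary": "List " + resource,
                "responses": {
                    "200": {"description": "OK", "schema": {"type": "array", "items": ref}}
                },
            },
            "post": {
                "summary": "Create a new entry in " + resource,
                "parameters": [body_param],
                "responses": {"201": {"description": "Created", "schema": ref}},
            },
        },
        collection + "/{id}": {
            "get": {
                "summary": "Read a single entry from " + resource,
                "parameters": [id_param],
                "responses": {"200": {"description": "OK", "schema": ref}},
            },
            "put": {
                "summary": "Update a single entry in " + resource,
                "parameters": [id_param, body_param],
                "responses": {"200": {"description": "OK", "schema": ref}},
            },
            "delete": {
                "summary": "Delete a single entry from " + resource,
                "parameters": [id_param],
                "responses": {"204": {"description": "Deleted"}},
            },
        },
    }


paths = crud_paths("pets", "Pet")
for path, item in paths.items():
    print(path, sorted(item))
```

The output is just a starting point that a designer would then edit by hand, which is the part a GUI editor makes pleasant.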

The Stoplight API Spec Editor bridges the process of defining your OpenAPI for your operations, with actually documenting and engaging with an API through an interactive client interface. I like this approach to coming at API design from multiple directions. Apiary first taught us that API definitions were about more than just documentation, and I think our API editors should keep evolving on this concept, allowing us to engage with any stop along the API life cycle like we are seeing from API service providers like Restlet.

I'm already keeping an eye on Restlet and APIMATIC's approach to providing a GUI API design editor within their solutions and will keep an eye out on other providers as I have time. Like other areas of the API sector, I'm hoping I can develop a list of best practices that any service provider can follow when developing their tools and services.


Learning To Use Our Words Better When Defining Our APIs

I am playing around with the OpenAPI definition for the Oxford Dictionaries API, and I'm struck by the importance of not just dictionaries like the Oxford Dictionaries, but also the importance of OpenAPI, and API providers defining their APIs like the Oxford folks have. While we aren't as far down the road as we are with the English dictionary, we are beginning to make progress when it comes to defining the paths, parameters, and other characteristics using OpenAPI, learning to speak and communicate in the digital world using APIs.

We use words to craft titles, paragraphs, outlines, and other ways that we communicate in our personal and professional lives. We also use words to craft titles, paragraphs, outlines, collections, and other ways our systems are communicating in our personal and professional lives using the OpenAPI specification. In both these forms of communicating we are always trying to find just the right words, or series and orders of words, to get across exactly the meaning we are looking for--we just have centuries of doing this when it comes to writing and speaking, and only a decade or so of doing this with defining our digital resources using APIs.

Eventually, I'd like to see entire dictionaries of JSON Schema, ALPS, or other machine readable specification, available by industry, and topic. The way we craft our API definitions and design our APIs often feels like we have barely learned to speak, let alone read or write. I'd like to see more reuse of common dictionaries already in use by leading API providers, and I'd like to see us get more thoughtful in how we express ourselves via our API definitions. The most successful APIs I find out there don't just provide a machine readable interface, they provide an intuitive interface that makes sense to humans, while also being machine readable for use in other systems.

It feels to me like we should be integrating the Oxford Dictionaries API into our API design tooling, letting us suggest, autocomplete, and discover better ways to articulate the meaning behind our APIs. API design editors could use the Oxford Dictionaries API to help developers attach more precise meaning to the names of paths, parameters, and other aspects of defining our APIs, much like word processors have done for the last couple of decades. Most APIs I come across do not have any sort of coherent name, ordering, or structure, and the few that have deployed an OpenAPI or other machine readable format often feel like cave writing, and lack any coherent structure, purpose, or meaning--we have a long, long way to go before our systems learn to communicate even at a 1st grade level.
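
As a rough illustration of what that kind of spell check for API names might look like, here is a minimal sketch. It assumes a small local word set standing in for a real dictionary lookup, and it makes no calls to the Oxford Dictionaries API. The `split_name` and `unknown_words` helpers are hypothetical names of my own.

```python
import re


def split_name(name):
    """Break a camelCase, snake_case, or hyphenated name into lowercase words."""
    # Insert a space at each lowercase-to-uppercase boundary (userId -> user Id).
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    # Split on anything that is not a letter and drop empty pieces.
    return [w.lower() for w in re.split(r"[^A-Za-z]+", spaced) if w]


def unknown_words(names, dictionary):
    """Report, per name, the pieces that are not found in the dictionary."""
    report = {}
    for name in names:
        missing = [w for w in split_name(name) if w not in dictionary]
        if missing:
            report[name] = missing
    return report


# Tiny stand-in word list; a real tool would query a full dictionary.
WORDS = {"user", "order", "items", "id"}

print(unknown_words(["userId", "order_items", "usrNm"], WORDS))
```

An editor could run a check like this as you type a path or parameter name, and suggest real words in place of abbreviations.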


Helping Your Customers Operate Throughout The API Life Cycle

When I started API Evangelist back in 2010 the only stop along the API life cycle that service providers were talking about was API management. In 2017, there are numerous stops along the API life cycle, from design, to testing, all the way to deprecation. The leading API providers are expanding the number of stops they service, and the smart ones are making sure that if they only service one or two stops, they do so via API definitions like OpenAPI, ensuring their customers are able to seamlessly weave multiple service providers together to address their full life cycle of needs.

I've been working with my partner Restlet to advise them on expanding their platform to be what I consider to be an API life cycle provider. When I first was introduced to Restlet they were the original open source enterprise grade API deployment framework. Then Restlet became a cloud API deployment and management provider, and with their acquisition of DHC they also became an API client, and testing provider. Now with their latest update, they have worked hard to help their developer and business customers service almost every stop along a modern API life cycle, from design to deprecation.

While Restlet is developing tooling to help companies define what the API life cycle means to them, the heartbeat of what Restlet delivers centers around API definitions like OpenAPI and RAML. API definitions provide the framework when you're designing, deploying, documenting, managing, and testing your APIs using Restlet. They also provide the ability for you to get your API definitions in and out of the platform, and load them into other API services, allowing API operators to get what they need done. In my opinion, this makes API definitions just as important as any other service or tooling you offer along the API life cycle.

Serving a single stop or a handful of stops along the API life cycle can be today's version of vendor lock-in. If your customers cannot easily load their API definitions in and out of your system you are locking them in, and while they may stay with you for a while, eventually they will need additional services, and the extra work it takes to keep in sync with your platform will increase, and eventually it won't be worth staying. I'm a big fan of companies doing one thing and doing it well, servicing single stops along the API life cycle, but after watching companies come and go for the last seven years, the ones that don't support API definitions won't be around too long.


API Definition: API Transformer

This is an article from the current edition of the API Evangelist industry guide to API definitions. The guide is designed to be a summary of the world of API definitions, providing the reader with a recent summary of the variety of specifications that are defining the technology behind almost every part of our digital world.

OpenAPI Spec is currently the most used API definition format out there, leading in both the number of implementations and the amount of tooling, with API Blueprint, Postman Collections, and other formats trailing behind. It can make sense to support a single API definition format when it comes to an individual platform's operations, but when it comes to interoperability with other systems it is important to be multi-lingual and support several of the top machine-readable formats out there today.

In my monitoring of the API sector, one service provider has stood out when it comes to being a truly multi-lingual API definition service provider--the SDK generation provider, APIMATIC. The company made API definitions the heart of its operations, generating what they call development experience (DX) kits, from a central API definition uploaded by users--supporting OpenAPI Spec, API Blueprint, Postman Collections, and other top formats. The approach has allowed the company to quickly expand into new areas like documentation, testing, continuous integration, as well as opening up their API definition translation as a separate service called API Transformer.

API Transformer allows anyone to input an API Blueprint, Swagger, WADL, WSDL, Google Discovery, RAML 0.8 - 1.0, I/O Docs - Mashery, HAR 1.2, Postman Collection 1.0 - 2.0, Mashape, or APIMATIC Format API definition and then translate and export it as API Blueprint, Swagger 1.0 - 1.2, Swagger 2.0 JSON, Swagger 2.0 YAML, WADL - W3C 2009, RAML 0.8 - 1.0, Postman Collection 1.0 - 2.0, or their own APIMATIC Format. You can execute API definition translations through their interface or seamlessly integrate with the API Transformer API definition conversion API.

There is no reason that API providers and API service providers shouldn’t be multi-lingual. It is fine to adopt a single API definition format as part of your own API operations, but when it comes to working with external groups, there is no excuse for not being able to work with any of the top API definition formats. The translation of API definitions will increasingly be essential to doing business throughout the API life cycle, requiring each company to have an API definition translation engine baked into their continuous integration workflow, transforming how they do business and build software.


If you have a product, service, or story you think should be in the API Evangelist industry guide to API design you can email me, or you can submit a Github issue for my API definition research, and I will consider adding your suggestion to the road map.


API Definition: U.S. Data Federation

This is an article from the current edition of the API Evangelist industry guide to API definitions. The guide is designed to be a summary of the world of API definitions, providing the reader with a recent summary of the variety of specifications that are defining the technology behind almost every part of our digital world.

The U.S. Data Federation is a federal government effort to facilitate data interoperability and harmonization across federal, state, and local government agencies by highlighting common data formats, API specifications, and metadata vocabularies. The project is focused on coordinating interoperability across government agencies by showcasing and supporting use cases that demonstrate unified and coherent data architectures across disparate agencies, institutions, and organizations.

The project is designed to shine a light on “emerging data standards and API initiatives across all levels of government, convey the level of maturity for each effort, and facilitate greater participation by government agencies”--definitely in alignment with the goal of this guide. There are currently seven projects profiled as part of the U.S. Data Federation, including Building & Land Development Specification, National Information Exchange Model, Open Referral, Open311, Project Open Data, Schema.org, and the Voting Information Project.

By providing a single location for agencies to find common schema documentation tools, schema validation tools, and automated data aggregation and normalization capabilities, the project is hoping to incentivize and stimulate reusability and interoperability across public data and API implementations. Government agencies of all shapes and sizes can use the common blueprints available in the U.S. Data Federation to reduce costs, speed up the implementation of projects, while also opening them up for augmenting and extending using their APIs, and common schema.

It is unclear what resources the U.S. Data Federation will have available in the current administration, but it looks like the project is just getting going, and intends to add more specifications as they are identified. The model reflects an approach that should be federated and evangelized at all levels of government, but also provides a blueprint that could be applied in other sectors like healthcare, education, and beyond. Aggregating common data formats, API specifications, metadata vocabularies, and authentication scopes will prove to be critical to the success of the overall climate of almost any industry doing business on the web in 2017.




API Definition: WebConcepts.info

This is an article from the current edition of the API Evangelist industry guide to API definitions. The guide is designed to be a summary of the world of API definitions, providing the reader with a recent summary of the variety of specifications that are defining the technology behind almost every part of our digital world.

Keeping up with standards bodies like the International Organization for Standardization (ISO) and the Internet Engineering Task Force (IETF) can be a full-time job. Thankfully, Erik Wilde (@dret) has helped simplify the concepts and specifications that make the web work, making them more accessible and easier to understand, with his WebConcepts.info project.

According to Erik, “the Web’s Uniform Interface is based on a large and growing set of specifications. These specifications establish the shared concepts that providers and consumers of Web services can rely on. Web Concepts is providing an overview of these concepts and of the specifications defining them.” His work is a natural fit for what I am trying to accomplish with my API definition industry guide, as well as supporting other areas of my research.

One of the areas that slows API adoption is a lack of awareness of the concepts and specifications that make the web work among developers who are providing and consuming APIs. The modern API leverages the same technology that drives the web--this is why it is working so well. The web is delivering HTML for humans, and APIs are using the same to deliver machine-readable data, content, and access to algorithms online. If a developer is not familiar with the fundamental building blocks of the web, the APIs they provide, and the applications they build on top of APIs will always be deficient.

This project provides an overview of 28 web concepts with 643 distinct implementations  aggregated across five separate organizations including the International Organization for Standardization (ISO), Internet Engineering Task Force (IETF), Java Community Process (JCP), Organization for the Advancement of Structured Information Standards (OASIS), and the World Wide Web Consortium (W3C)--who all contribute to what we know as the web.  An awareness and literacy around the 28 concepts aggregated by Web Concepts is essential for any API developer and architect looking to fully leverage the power of the web as part of their API work.

After aggregating the 28 web concepts from the five standards organizations, Web Concepts additionally aggregates 218 actual specifications that API developers, architects, and consumers should be considering when putting APIs to work. Some of these specifications are included as part of this API Definition guide, and I will be working to add additional specifications in future editions of this guide, as it makes sense. The goal of this guide is to help bring awareness, literacy, and proficiency with common API and data patterns, so making use of the work of the Web Concepts project, and building on the web literacy work already delivered by Erik, just makes sense.

Web Concepts is published as a Github repository, leveraging Github Pages for the website. Erik has worked hard to make the concepts and specifications available as JSON feeds, providing a machine-readable feed that can be integrated into existing API design, deployment, and management applications--providing web literacy concepts and specifications throughout the API life cycle. All JSON data is generated from the source data, which is managed as a set of XML descriptions of specifications, with the build process based upon XSLT and Jekyll, providing multiple ways to approach all concepts and specifications, while maintaining the relationship and structure of all the moving parts that make up the web.

When it comes to the startup space, the concepts that make up the web, and the specifications that make it all work, might seem a little boring, something only the older engineers pay attention to.  Web Concepts helps soften, and make these critical concepts and specifications accessible and digestible for a new generation of web and API developers--think of them as gummy versions of vitamins. If we are going to standardize how APIs are designed, deployed, and managed--making all of this much more usable, scalable, and interoperable, we are going to have to all get on the same page (aka the web).

Web Concepts is an open source project, and Erik encourages feedback on the concepts and specifications. I encourage you to spend time on the site regularly and see where you can integrate the JSON feeds into your systems, services, and tooling. We have a lot of work ahead of us to make sure the next generation of programmers have the base amount of web literacy necessary to keep the web strong and healthy. There are two main ways to contribute to the building blocks of the web: you can participate as a contributor to a standards body, or you can make sure you are implementing common concepts and specifications throughout your work, contributing to the web, and not just walled gardens and closed platforms.




API Definition: Open API Specification 3.0.0-RC0

This is an article from the current edition of the API Evangelist industry guide to API definitions. The guide is designed to be a summary of the world of API definitions, providing the reader with a recent summary of the variety of specifications that are defining the technology behind almost every part of our digital world.

The OpenAPI Specification, formerly known as Swagger, is approaching an important milestone: version 3.0 is the first major release since the specification was brought into the Linux Foundation. Swagger was the creation of Tony Tam of Wordnik back in 2010. After the project was acquired by SmartBear in 2015, the company donated the specification to the newly formed OpenAPI Initiative (OAI), which is part of the Linux Foundation. Swagger has now been reborn as the OpenAPI Specification, and in early 2017 it is approaching its first major release under the guidance of a diverse group of governing members.

Version 3.0 of the API specification format has taken a much more modular, and reusable approach to defining the surface area of an API, enabling more power and versatility when it comes to describing the request and response models, as well as providing details on the common components that make up API usage like the underlying data schema and security definitions. There are numerous changes to the API specification, but there are just a handful that will have a significant impact across every stop along the API life cycle where API definitions are making an impact.

Hosts

When describing your API, you can now provide multiple hosts, allowing you to better deal with the complexity of how APIs might be located in a single location, or spread across multiple cloud locations and global infrastructure.
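
As a small illustration of the shift, here is a sketch that turns the single `host`, `basePath`, and `schemes` of a Swagger 2.0 definition into a list of server entries. It assumes the `servers` array with `url` fields as it appears in the published 3.0 specification, so the release candidate may differ. The `to_servers` helper and the fallback to `https` when no scheme is listed are my own assumptions.

```python
def to_servers(swagger2):
    """Map Swagger 2.0 host, basePath, and schemes to a 3.0 style servers list."""
    base = swagger2.get("basePath", "")
    host = swagger2.get("host")
    if not host:
        # Without a host, fall back to a relative URL built from the base path.
        return [{"url": base or "/"}]
    # Assumption: default to https when the definition lists no schemes.
    schemes = swagger2.get("schemes") or ["https"]
    return [{"url": f"{scheme}://{host}{base}"} for scheme in schemes]


# Hypothetical 2.0 definition fragment.
legacy = {"host": "api.example.com", "basePath": "/v1", "schemes": ["https", "http"]}
print(to_servers(legacy))
```

Once the list exists, adding another region or a sandbox environment is just one more entry, which is not something the single `host` field could express.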

Content Negotiation

You can now provide content objects to define the relationship between response objects, media types, and schema, opening up the negotiation of exactly the type of content needed, encouraging the design of multiple rich dimensions for any single API resource.

Body

The latest version of the specification plays catch-up when it comes to allowing the body of a request and response to be defined separately from the request parameters, allowing for more control over the payload of any API calls.

Schema

There is an increased investment in JSON Schema, including support for the `oneOf`, `anyOf` and `not` keywords, allowing for alternative schemas alongside the standard JSON Schema definitions.
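
To show what these three keywords mean in practice, here is a toy matcher that handles only `type`, `required`, `oneOf`, `anyOf`, and `not`. It is a sketch for building intuition, not a JSON Schema validator, and the `matches` function and the sample schemas are invented for this example.

```python
# Only the handful of types this sketch needs.
TYPES = {"string": str, "integer": int, "object": dict}


def matches(value, schema):
    """Check a value against a tiny subset of JSON Schema keywords."""
    if "type" in schema and not isinstance(value, TYPES[schema["type"]]):
        return False
    if isinstance(value, dict) and any(k not in value for k in schema.get("required", [])):
        return False
    # oneOf: exactly one of the alternatives must match.
    if "oneOf" in schema and sum(matches(value, s) for s in schema["oneOf"]) != 1:
        return False
    # anyOf: at least one of the alternatives must match.
    if "anyOf" in schema and not any(matches(value, s) for s in schema["anyOf"]):
        return False
    # not: the value must fail the given schema.
    if "not" in schema and matches(value, schema["not"]):
        return False
    return True


# Hypothetical schemas: a pet is either a dog or a cat, never both.
pet = {"oneOf": [
    {"type": "object", "required": ["bark"]},
    {"type": "object", "required": ["meow"]},
]}
identifier = {"anyOf": [{"type": "string"}, {"type": "integer"}]}
not_a_string = {"not": {"type": "string"}}

print(matches({"bark": "loud"}, pet))
print(matches({"bark": "loud", "meow": "soft"}, pet))
print(matches(42, identifier))
print(matches("hello", not_a_string))
```

The second call is the interesting one: a value that satisfies both alternatives fails `oneOf`, where it would pass `anyOf`.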

Components

The new components architecture really reflects APIs, making everything very modular, reusable, and much more coherent and functional. The new version encourages good schema and component reuse, helping further bring the definition into focus.
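
The reuse works through `$ref` pointers into a shared `components` section. Below is a minimal sketch of following a local reference, assuming the `#/components/...` layout from the published 3.0 specification. The `resolve` helper and the sample document are hypothetical, and remote references are deliberately left out.

```python
def resolve(document, ref):
    """Follow a local reference such as '#/components/schemas/User'."""
    if not ref.startswith("#/"):
        raise ValueError("only local references are handled in this sketch")
    node = document
    for key in ref[2:].split("/"):
        # Undo JSON Pointer escaping: ~1 is '/', ~0 is '~'.
        node = node[key.replace("~1", "/").replace("~0", "~")]
    return node


# Hypothetical definition with one shared schema and one shared parameter.
document = {
    "components": {
        "schemas": {
            "User": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
        "parameters": {
            "limit": {"name": "limit", "in": "query", "schema": {"type": "integer"}},
        },
    },
    "paths": {
        "/users": {
            "get": {
                "parameters": [{"$ref": "#/components/parameters/limit"}],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {"$ref": "#/components/schemas/User"}
                            }
                        },
                    }
                },
            }
        }
    },
}

operation = document["paths"]["/users"]["get"]
print(resolve(document, operation["parameters"][0]["$ref"]))
media = operation["responses"]["200"]["content"]["application/json"]
print(resolve(document, media["schema"]["$ref"]))
```

The response in this sample also uses a content object keyed by media type, so the same shared schema could be offered under several media types without being repeated.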

Linking

While not quite full hypermedia, version 3.0 of the OpenAPI Spec supports linking, allowing for the description of relationships between paths, giving a nod towards hypermedia design patterns, making this version the most resilient so far.

Webhooks

Like the nod towards hypermedia, the specification now allows for the inclusion of callbacks that can be attached to a subscription operation describing an outbound operation--providing much needed webhook descriptions as part of API operations, making it much more real time and event driven.
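
Here is a sketch of what a callback attached to a subscription operation can look like, written as a Python dictionary. It assumes the `callbacks` layout and the `{$request.body#/callbackUrl}` runtime expression style from the published 3.0 specification, so the release candidate may differ. The paths, the `orderShipped` name, and the `list_callbacks` helper are invented for illustration.

```python
# Hypothetical subscription operation with one outbound callback.
subscription_paths = {
    "/subscriptions": {
        "post": {
            "summary": "Subscribe to order events",
            "requestBody": {
                "content": {
                    "application/json": {
                        "schema": {
                            "type": "object",
                            "required": ["callbackUrl"],
                            "properties": {
                                "callbackUrl": {"type": "string", "format": "uri"}
                            },
                        }
                    }
                }
            },
            "responses": {"201": {"description": "Subscription created"}},
            "callbacks": {
                "orderShipped": {
                    # The URL to call is taken from the subscription request body.
                    "{$request.body#/callbackUrl}": {
                        "post": {
                            "requestBody": {
                                "content": {
                                    "application/json": {"schema": {"type": "object"}}
                                }
                            },
                            "responses": {"200": {"description": "Event received"}},
                        }
                    }
                }
            },
        }
    }
}


def list_callbacks(paths):
    """Return (operation, callback name, URL expression) for every callback."""
    found = []
    for path, item in paths.items():
        for method, operation in item.items():
            if not isinstance(operation, dict):
                continue
            for name, callback in operation.get("callbacks", {}).items():
                for expression in callback:
                    found.append((method.upper() + " " + path, name, expression))
    return found


print(list_callbacks(subscription_paths))
```

A listing like this is a quick way to see which operations in a definition push events outward, rather than only answering requests.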

Examples

The new version enables API architects to better describe, and provide examples of, API responses and requests, helping make API integration a learning experience by providing examples for use beyond just documentation descriptions.

Cookies

Responding to a large number of API providers, and the reality on the ground for many implementations, the new version allows for the definition of cookies as part of API requests.

These nine areas represent the most significant changes in OpenAPI Spec 3.0. The most notable shift is the componentized, modular, reusable aspect of the specification. After that, it is the content negotiation, linking, webhooks, and other changes that are moving the needle. It is clear that the 3.0 version of the specification has considered the design patterns across a large number of implementations, providing a pretty wide-reaching specification for defining what an API does, in a world where every API can be a special snowflake.

In 2017, the OpenAPI Spec is the clear leader among API definition formats, with the largest adoption and the most tooling available. While documentation and SDK generation are still the top two reasons for crafting API definitions, there are numerous other reasons for using them, including mocking, testing, monitoring, IDE integration, and much, much more.




If you think there is a link I should have listed here feel free to tweet it at me, or submit as a Github issue. Even though I do this full time, I'm still a one person show, and I miss quite a bit, and depend on my network to help me know what is going on.