What is an API?

An API (Application Programming Interface) is a set of rules and protocols enabling software programs to communicate and share information. While the term is often associated with “web APIs,” it encompasses a broader concept. For instance, an API might allow a library to share its collection data with a museum or enable a weather service to provide updates to a news organization. APIs operate based on predefined rules set by developers, specifying how data can be accessed and used. While some APIs are public, most are internal, facilitating communication between systems within an organization.

Using an API

For those unfamiliar with computer science, APIs can feel abstract, so let’s make the idea concrete with an example using The Cat API. This API lets you retrieve information about different cat breeds, along with images and descriptions. The information is not available through a public interface, so if you want to see a cat image, you must use the API.

In this example, let’s get information about a Ragdoll cat. To do this, we need to call this API endpoint: https://api.thecatapi.com/v1/images/XFhRpYS_D.

You can click on that link to view the information about the cat in a format similar to this:

async function getCatData(catId) {
    const url = `https://api.thecatapi.com/v1/images/${catId}`;
    const response = await fetch(url);
    if (!response.ok) throw new Error(`HTTP error! status: ${response.status}`);
    const data = await response.json();
    return data;
}

catData = await getCatData("XFhRpYS_D");

viewof catDataString = {
  const pre = html`<pre style="background-color: #f8f9fa; padding: 15px; border-radius: 5px; overflow-x: auto;">`;
  const code = html`<code style="color: #333;">`;
  
  // Formatted JSON with syntax highlighting
  const formatted = JSON.stringify(catData, null, 2)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"([^"]+)":/g, '<span style="color: #a31515;">"$1"</span>:') 
    .replace(/: "([^"]+)"/g, ': <span style="color: #008000;">"$1"</span>')  
    .replace(/: ([0-9]+)/g, ': <span style="color: #0000FF;">$1</span>')  
    .replace(/\b(true|false|null)\b/g, '<span style="color: #FF0000;">$1</span>');
    
  code.innerHTML = formatted;
  pre.appendChild(code);
  return pre;
}

This format is called JSON, and it’s widely used by APIs to return data. We will look at the format in more detail later; for now, it is enough to know that this output is called a response and that it contains keys and values. Keys are the names of the pieces of information we requested, and values are the information itself. For instance, in this response, the URL of the Ragdoll image is stored under the url key, and its value is https://cdn2.thecatapi.com/images/XFhRpYS_D.jpg.
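
To make the key/value idea concrete, here is a minimal sketch in plain JavaScript that parses a shortened, hand-written version of such a response. Only the url value and the Ragdoll breed name are taken from the example; the description text is a placeholder, and the real response contains more fields than shown here.

```javascript
// A shortened, hand-written stand-in for the API response (not the full payload).
const responseText = `{
  "url": "https://cdn2.thecatapi.com/images/XFhRpYS_D.jpg",
  "breeds": [
    { "name": "Ragdoll", "description": "(placeholder description)" }
  ]
}`;

// JSON.parse turns the JSON text into a regular JavaScript object.
const cat = JSON.parse(responseText);

// Each key gives access to its value.
const imageUrl = cat.url;
// "breeds" holds a list, so we take its first item before reading "name".
const breedName = cat.breeds[0].name;

console.log(Object.keys(cat)); // lists the top-level keys
console.log(imageUrl, breedName);
```

Values can be plain text, numbers, lists, or further key/value objects, which is why the breed name sits one level deeper than the image URL.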

JSON syntax

Now we want to show the actual image of the Ragdoll, not just the raw data. To do this, we have to access the values associated with each key. The URL of the image is under the url key, while the breed name and its description are nested inside the breeds key, under name and description. With a bit of coding, we can display the image of the cat:

function displayCat(data) {
    const container = html`<div class="card" style="max-width: 300px; margin: 15px auto;">
        <img src="${data.url}" class="card-img-top" alt="${data.breeds[0].name}">
        <div class="card-body">
            <h5 class="card-title">Breed: ${data.breeds[0].name}</h5>
            <p class="card-text">${data.breeds[0].description}</p>
        </div>
    </div>`;
    return container;
}

displayCat(catData);

Cute, isn’t it?

Behind the scenes, the API retrieves the information from the server’s database and returns it to the client in a format that a computer can parse easily. The browser can then use that information to display the cat image or to look up additional details about the cat.

A simple representation of an API

Because the response is structured, we can present it in a more readable form and reuse the same functions to get images of other cats, for instance, a Bengal cat:

bengalCatData = await getCatData("LSaDk6OjY");
displayCat(bengalCatData);

You can even get a random cat image!
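
One possible way to do this is sketched below. It assumes The Cat API’s /v1/images/search endpoint, which returns a list containing a randomly chosen image when called without parameters. The function name getRandomCat and the optional fetchFn parameter are illustrative choices rather than part of the code above.

```javascript
// Hypothetical helper: requests one random image from The Cat API.
// fetchFn defaults to the global fetch; passing another function makes testing easy.
async function getRandomCat(fetchFn = fetch) {
  const url = "https://api.thecatapi.com/v1/images/search";
  const response = await fetchFn(url);
  if (!response.ok) throw new Error(`HTTP error! status: ${response.status}`);

  // Assumed response shape: an array of image objects, each with a "url" key.
  const results = await response.json();
  if (!Array.isArray(results) || results.length === 0) {
    throw new Error("The API returned no images");
  }
  return results[0];
}
```

Calling getRandomCat() and reading the url key of the result gives the address of the picture. Images returned this way may not include breed details, so a display function that reads breeds[0] might need a fallback.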

Now that we’ve seen how APIs work, let’s dive into why they’re so essential.

Why Do We Need APIs?

A common question might be: why use an API instead of simply sharing the data directly?

Simplifying Access to Data

For researchers, APIs simplify access to complex datasets. Imagine working with a public archive or a library catalog where data is stored in various relational tables. Instead of manually extracting and organizing the data, an API allows you to query for specific subsets—like texts published in a particular year or metadata about historical photographs—without worrying about the database structure.

An API simplifies this process by predefining these complex queries and providing endpoints to access the data you need with minimal effort. Instead of interacting directly with the database, users make requests to the API and receive only the required data in a user-friendly format, like JSON.
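
As a rough illustration of what such a request might look like, the sketch below builds a query URL for an imaginary archive API. The base URL https://api.example.org/v1/texts and the parameter names year, fields, and limit are invented for this example; a real API documents its own endpoints and parameters.

```javascript
// Builds a request URL for a hypothetical archive API.
// The endpoint and parameter names are assumptions made for illustration only.
function buildQueryUrl(baseUrl, params) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    // searchParams takes care of encoding special characters.
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const requestUrl = buildQueryUrl("https://api.example.org/v1/texts", {
  year: 1923,
  fields: "title,author",
  limit: 10,
});

console.log(requestUrl);
// https://api.example.org/v1/texts?year=1923&fields=title%2Cauthor&limit=10
```

The user only describes what they want (texts from 1923, two fields, ten results); how the tables are joined behind the endpoint stays hidden.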

Enhancing Security and Performance

Direct access to a database can compromise its security and integrity. Opening a database to the public exposes it to risks, such as unauthorized modifications or data breaches. APIs mitigate these risks by controlling what data is accessible and to whom. For instance, APIs can restrict access to specific users or IP addresses. APIs can also limit the amount of data retrieved per request, which helps performance.
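
To show how an API can act as a gatekeeper, here is a deliberately simplified sketch of the checks a server might run before touching the database. The key list, the MAX_LIMIT value, the default of 10 rows, and the function name checkRequest are all hypothetical; production APIs typically rely on a web framework and a proper authentication system.

```javascript
// Hypothetical settings: which keys are accepted, and the most rows one request may return.
const ALLOWED_KEYS = new Set(["team-a-key", "team-b-key"]);
const MAX_LIMIT = 100;

// Decides whether a request may proceed and how many rows it may fetch.
function checkRequest({ apiKey, limit }) {
  if (!ALLOWED_KEYS.has(apiKey)) {
    // Unknown key: refuse before any data is read.
    return { allowed: false, status: 401, limit: 0 };
  }
  // Use a default of 10 for a missing or invalid limit, and never exceed MAX_LIMIT.
  const requested = Number.isInteger(limit) && limit > 0 ? limit : 10;
  return { allowed: true, status: 200, limit: Math.min(requested, MAX_LIMIT) };
}

console.log(checkRequest({ apiKey: "unknown", limit: 5 }));       // rejected with status 401
console.log(checkRequest({ apiKey: "team-a-key", limit: 5000 })); // allowed, limit capped at 100
```

Capping the limit is what keeps a single request from pulling the whole database, which protects both the data and the server’s performance.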

For collaborative research, APIs ensure secure and consistent access to data across teams and institutions. They also enhance reproducibility by allowing other researchers to replicate data retrieval processes through well-documented endpoints.

Why Not Share Data as a Downloadable File?

Sharing data in formats like Excel or CSV files can sometimes be a practical solution. However, this approach has significant limitations. For instance, downloadable files do not update automatically when the source data changes: users must download the file again every time updates are made, which can leave them working with outdated information. Another limitation is file size. For large datasets, downloading and processing the entire file can be inefficient and cumbersome.

Ultimately, whether to use an API or share data as a downloadable file depends on the project’s specific needs, balancing dynamic access with ease of use. When you need to retrieve specific data subsets or interact with the data dynamically, an API is ideal. It provides granularity and ensures users access the most up-to-date information. When users need the entire dataset for analysis or offline use, offering a downloadable file might be more appropriate.

APIs for bulk data retrieval

APIs are generally not designed for bulk data retrieval but for accessing data in a more focused and controlled way. They bridge the gap between complex databases and the users who need specific, timely information.

APIs in Research

In research, APIs bridge the gap between data and analysis by automating data collection and streamlining workflows. A historian studying digitized newspapers, for instance, can use APIs to query articles that mention specific events, dates, or people. Social scientists can analyze real-time conversations and trends through APIs from platforms like X or YouTube, while environmental researchers can leverage satellite data APIs to monitor deforestation patterns. By enabling seamless integration across datasets, APIs foster interdisciplinary collaboration and empower scholars to tackle complex questions with innovative approaches.

In summary

APIs are not just tools for developers—they hold immense potential for researchers, too. By providing structured, dynamic access to datasets, APIs enable scholars to automate data collection, access real-time information, and integrate diverse data sources into their workflows. Whether you’re studying digital humanities, analyzing climate data, or investigating social media trends, APIs allow you to retrieve exactly the data you need, at scale and with precision. Embracing APIs as part of your research toolkit opens doors to innovative methodologies and insights that might otherwise remain out of reach.