Because GatsbyJS can query data only via GraphQL endpoints.
Refer to Querying with GraphQL.
💭 Terminologies & Concepts
Let’s make sure we are on the same page.
- GraphQL Source – This is the data that GatsbyJS can query via GraphQL.
- Node – The documentation calls a node a “model”: it describes the shape of the data. (Not to be confused with Node.js.)
- gatsby-node.js – This is where you define your GraphQL sources and it’s located in the project root.
Now that we’ve cleared up some terms and concepts, let’s review the Hacker News API.
🔍 Hacker News API Overview
The Official Hacker News API (“HN API” hereafter) exposes top level endpoints for “Top”, “Best”, and “New” stories.
Top-level endpoints return only IDs, with no other data associated with them.
Each one returns an array of story IDs:
[ 9127232, 9128437, 9130049, 9130144, 9130064, 9130028, 9129409, 9127243, 9128571, ..., 9120990 ]
So you’d need to make a call for each story ID returned from the top level endpoint.
It’s not an optimal design, and the HN team admits it.
But I am thankful that the HN team has provided a public API for their stories.
So with that in mind, let’s move on to creating a source.
🏙 Implementation Steps
Now let’s see how to turn the Hacker News API into a GraphQL source by wrapping its data as nodes, following the steps below.
💡 Get all top level story IDs from HN API
Let’s get all top level story IDs from HN API.
There are duplicate stories in Top, New, and Best stories. So let’s cache only distinct story IDs.
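Here is a minimal sketch of that step, not the exact code from my gatsby-node.js. The base URL and the list names ("topstories", "newstories", "beststories") are what I believe the official HN API uses, so check them against its docs. `fetchJson` is a hypothetical helper (URL in, parsed JSON out) that you would supply yourself, and `getStoryIds` and `distinctStoryIds` are names made up for this example.

```javascript
// Assumed base URL of the official HN API; verify against its documentation.
const HN_BASE_URL = "https://hacker-news.firebaseio.com/v0";

// Fetch the array of story IDs for one list, e.g. "topstories".
// `fetchJson` is a hypothetical helper: it takes a URL and resolves to parsed JSON.
const getStoryIds = (listName, fetchJson) =>
  fetchJson(`${HN_BASE_URL}/${listName}.json`);

// Merge any number of ID arrays and keep each ID only once.
// A Set drops duplicates and preserves first-seen order.
const distinctStoryIds = (...idLists) => [...new Set([].concat(...idLists))];
```

With the three ID arrays in hand, `distinctStoryIds(topIds, newIds, bestIds)` gives the list of stories that actually need fetching.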
Getting all stories is as simple as calling an endpoint with story ID as part of the URL.
You are creating sources for “Top”, “New”, and “Best” stories, where “data” contains the arrays of story IDs that were fetched previously.
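Fetching the stories themselves could look roughly like the sketch below. The item URL is my understanding of the HN API, and the signature of `getStories` here is an assumption for illustration: it takes the IDs plus a hypothetical `fetchJson` helper (URL in, parsed JSON out).

```javascript
// Assumed item endpoint of the official HN API: the story ID is part of the URL.
const itemUrl = (storyId) =>
  `https://hacker-news.firebaseio.com/v0/item/${storyId}.json`;

// Fetch every story in parallel, one request per ID.
// `fetchJson` is a hypothetical helper: it takes a URL and resolves to parsed JSON.
const getStories = (storyIds, fetchJson) =>
  Promise.all(storyIds.map((storyId) => fetchJson(itemUrl(storyId))));
```

`Promise.all` keeps the results in the same order as the IDs, but it rejects as soon as a single request fails, so a real build may want retries or per-request error handling.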
Now that we’ve fetched all the data, let’s create story nodes to expose it to GatsbyJS.
💡 Create source nodes
The top/new/bestResults from the previous step are now used to create nodes.
The nodes are created by the aptly named createStoryNodes.
The shape of each node is defined by the storyNode object. Let’s go over each property.
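To have something concrete to look at while reading the list, here is a sketch of what such a function might look like. The parameter list, the `type-storyId` ID format, and the injected `buildContentDigest` are assumptions for this example; only the property names come from the description in this post.

```javascript
// Sketch only: the signature is an assumption, not the original code.
// type               - node type name, e.g. "TopStories"
// storyIds           - array of story IDs for that type
// items              - Map of story ID -> story data
// createNode         - the function GatsbyJS provides for creating nodes
// buildContentDigest - function returning a hash string for a story
const createStoryNodes = (type, storyIds, items, createNode, buildContentDigest) => {
  storyIds.forEach((storyId) => {
    const item = items.get(storyId);
    if (!item) return; // skip IDs whose story was not fetched

    const storyNode = {
      id: `${type}-${storyId}`, // type + story ID keeps the ID globally distinct
      parent: null,
      children: [],
      internal: {
        type, // name of the GraphQL type
        contentDigest: buildContentDigest(item),
      },
      storyId,
      item,
    };

    createNode(storyNode);
  });
};
```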
- id
  - This is created by combining the type with the story ID, where the types are “TopStories”, “BestStories”, and “NewStories”.
  - This makes each record distinct, so you can get this record and only this record if you need to.
  - This is similar to a primary key, if you are familiar with database terms.
  - You can’t use the story ID alone as the ID, because Top, Best, and New stories can contain the same story. That is why the “type” is added: it makes each record globally distinct.
- parent & children
  - I honestly do not know 😅 the exact use cases for these yet, as I could not find any good documentation for them.
  - The best I found was this documentation but, without a concrete example, I had to look at other source plugins like gatsby-source-firebase.
  - A shameless plea – I’d appreciate it if you could help me understand the why, where, and how of these parameters.
- internal
  - type – This is the name you want the GraphQL type to have.
  - For the three createStoryNodes method calls, I passed “TopStories” for the first call so it’s available as “topStories” in GraphQL.
- storyId – This is self-explanatory, skip!
- item – This contains the actual story data, but where does it come from?
Remember that we defined the getStories function but never called it?
items is a map of all stories fetched using getStories.
The stories are fetched once and cached in a map, from which the story nodes are then constructed.
A Map object (not Array#map) is used so that each story can be looked up by ID in constant time (O(1)).
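Building that map could be sketched like this. `toItemsMap` is a name made up for the example, and it assumes each fetched story carries its own `id` field, as HN API items do.

```javascript
// Index an array of fetched stories by ID so each lookup is O(1).
// Entries that came back empty (null/undefined) are dropped first.
const toItemsMap = (stories) =>
  new Map(stories.filter(Boolean).map((story) => [story.id, story]));
```

`items.get(storyId)` then returns the story directly, instead of scanning an array for every ID in the Top, New, and Best lists.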
Content Digest (scroll down to “Parameters”) helps GatsbyJS track whether the data has changed, which lets it work more efficiently.
The buildContentDigest function serializes a story and hashes it with the MD5 algorithm into a hex string.
Honestly, again, I used the implementation from the documentation, as I don’t know much about GatsbyJS’s internal details.
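A digest function along those lines can be written with Node’s built-in crypto module. This is a sketch of the general pattern (JSON-serialize, MD5, hex), not necessarily character for character what the documentation shows.

```javascript
const crypto = require("crypto");

// Serialize the story to JSON and hash it with MD5, returned as a hex string.
// The same content always yields the same digest, so a changed digest
// signals changed data.
const buildContentDigest = (content) =>
  crypto
    .createHash("md5")
    .update(JSON.stringify(content))
    .digest("hex");
```

MD5 is used here only as a cheap change detector, not for anything security related.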
💡 Make it available to GatsbyJS
Now you export the stories source for GatsbyJS at the bottom of gatsby-node.js.
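The export might look like the sketch below. `buildAllStoryNodes` is a hypothetical stand-in for the fetching and node-building steps described earlier, with one hard-coded node so the example runs on its own. I am also assuming the Gatsby v2 argument shape, where `createNode` lives on `actions`; in Gatsby v1 the same function is exposed on `boundActionCreators`.

```javascript
// Hypothetical stand-in for the earlier steps (fetch IDs, fetch stories,
// build nodes). It returns a single hard-coded node for illustration.
const buildAllStoryNodes = async () => [
  {
    id: "TopStories-1",
    parent: null,
    children: [],
    internal: { type: "TopStories", contentDigest: "placeholder-digest" },
    storyId: 1,
    item: { id: 1, title: "Example story" },
  },
];

// GatsbyJS calls sourceNodes at build time and hands it the node-creation API.
const sourceNodes = async ({ actions }) => {
  const { createNode } = actions;
  const nodes = await buildAllStoryNodes();
  nodes.forEach((node) => createNode(node));
};

exports.sourceNodes = sourceNodes;
```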
📞 How to call (use the source)
GatsbyJS passes your page component props containing a data property, which in turn contains the actual data fetched using GraphQL.
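As a rough illustration of the shape of that data, the sketch below pairs a query with a small function that flattens the result. It assumes the node type is “TopStories”, so that GatsbyJS exposes an `allTopStories` field with the usual `edges`/`node` structure, and that stories have `title` and `url` fields. `topStoriesQuery` and `toStoryList` are names made up for this example.

```javascript
// Query text as it might appear in a page component.
// Assumes the nodes were created with the type "TopStories".
const topStoriesQuery = `
  {
    allTopStories {
      edges {
        node {
          storyId
          item {
            title
            url
          }
        }
      }
    }
  }
`;

// Flatten the `data` prop into a plain list that is easy to render.
const toStoryList = (data) =>
  data.allTopStories.edges.map(({ node }) => ({
    id: node.storyId,
    title: node.item.title,
    url: node.item.url,
  }));
```

In a real page the query is wrapped in GatsbyJS’s `graphql` tag and exported from the page file, and the component receives the result as `props.data`.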
Here is the full source code of gatsby-node.js.
👋 Parting Words
The code might not be optimal at fetching data, but a static site generator fetches and caches the data at build time, before generating the site, so it won’t affect the site’s performance in the end.
But I’d love to see if you have any suggestions on how to improve it 🙂