Node.js: Reading content from PDF and CSV files

Last updated on March 19, 2022 by A Goodman


Node.js uses non-blocking I/O, so it works efficiently with files, even very large ones. PDF, which stands for Portable Document Format, is used to display text and images independently of software and hardware. CSV, or comma-separated values, is a file format that stores tabular data (numbers and text) in plain text.
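To make the non-blocking point concrete, here is a minimal sketch that uses only built-in modules. The scratch file name and the events array are made up for this demo and are not part of the examples in this article. It shows the main script carrying on while a file read is still pending:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only for this demo
const demoFile = path.join(os.tmpdir(), 'non-blocking-demo.txt');
fs.writeFileSync(demoFile, 'hello from disk');

const events = [];

// Start the read; the callback runs later, once the data is available
const pendingRead = fs.promises.readFile(demoFile, 'utf8').then((text) => {
  events.push(`file read: ${text}`);
  return events;
});

// This line runs first, while the read is still pending
events.push('read scheduled, main script continues');

pendingRead.then((log) => console.log(log));
// [ 'read scheduled, main script continues', 'file read: hello from disk' ]
```

The synchronous fs.readFileSync, by contrast, holds up everything else until the whole file has been read, which is fine for small scripts but worth avoiding in servers.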

This article will show you how to read content from PDF and CSV files using Node.js through 2 end-to-end examples.

The PDF file we’ll use for testing in this tutorial:

And here’s the CSV:

Working with PDF files

We will use a library named pdf-parse to do the job.

1. Copy the PDF from the link above to the folder where you want your example project to live, then create a file named index.js there.

2. Install pdf-parse by running this command:

npm install pdf-parse --save

Our file structure:

├── dummy.pdf
├── index.js
├── package-lock.json
├── package.json
└── node_modules

3. Add the following to index.js:

const fs = require('fs');
const pdfParse = require('pdf-parse');

// Reads a PDF from disk and logs its text, page count, and metadata
const readPdf = async (uri) => {
    const buffer = fs.readFileSync(uri);
    try {
        const data = await pdfParse(buffer);

        // The content
        console.log('Content: ', data.text);

        // Total page
        console.log('Total pages: ', data.numpages);

        // File information
        console.log('Info: ', data.info);
    } catch (err) {
        throw new Error(err);
    }
};

// Testing
const DUMMY_PDF = './dummy.pdf';
readPdf(DUMMY_PDF);

4. Run the code with node index.js and check the output in the console. It should look like this:

Content:

Dummy PDF file
Total pages:  1
Info:  {
  PDFFormatVersion: '1.4',
  IsAcroFormPresent: false,
  IsXFAPresent: false,
  Author: 'Evangelos Vlachogiannis',
  Creator: 'Writer',
  Producer: ' 2.1',
  CreationDate: "D:20070223175637+02'00'"
}
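The CreationDate value in the output above uses the PDF date notation (D:YYYYMMDDHHmmSS followed by a UTC offset), which the JavaScript Date constructor does not understand. Below is a minimal sketch of a converter. parsePdfDate is a hypothetical helper written for this article, not something pdf-parse provides, and it assumes that a date without an offset should be treated as UTC:

```javascript
// Hypothetical helper (not part of pdf-parse): converts a PDF date string
// such as "D:20070223175637+02'00'" into a JavaScript Date.
const parsePdfDate = (value) => {
  // D:YYYY, then optional MM DD HH mm SS, then an optional offset:
  // Z, or +HH'mm' / -HH'mm'
  const pattern =
    /^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$/;
  const match = pattern.exec(value);
  if (!match) return null;

  // Missing parts fall back to the earliest valid value
  const [
    , year, month = '01', day = '01',
    hour = '00', minute = '00', second = '00',
    , sign, offHour = '00', offMinute = '00',
  ] = match;

  // Treat the fields as UTC first, then correct by the offset
  const asUtc = Date.UTC(+year, +month - 1, +day, +hour, +minute, +second);

  // Assumption: no offset (or Z) means the time is already UTC
  const offsetMinutes = sign
    ? (sign === '-' ? -1 : 1) * (Number(offHour) * 60 + Number(offMinute))
    : 0;

  // The offset is local time minus UTC, so subtract it to get UTC
  return new Date(asUtc - offsetMinutes * 60 * 1000);
};

console.log(parsePdfDate("D:20070223175637+02'00'").toISOString());
// 2007-02-23T15:56:37.000Z
```

In the PDF example you could call it as parsePdfDate(data.info.CreationDate). It returns null for strings that do not match the pattern, so check the result before using it.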

Reading CSV files

We’ll use fast-csv to extract data from a CSV file. It’s very lightweight but powerful and works well with both small and very big CSV files.
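To see why a dedicated parser is worth installing, consider what happens when a field contains a comma. The sketch below uses a made-up row that is not in our test file and compares a naive split(',') with a simplified quote-aware splitter. splitCsvLine is only an illustration of the problem. It is not how fast-csv is implemented, and it ignores cases such as line breaks inside quoted fields:

```javascript
// Naive approach: breaks as soon as a quoted field contains a comma
const naiveSplit = (line) => line.split(',');

// Simplified quote-aware splitter for a single line (illustrative only)
const splitCsvLine = (line) => {
  const fields = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // a doubled quote is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current); // field boundary
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
};

// Hypothetical row with a comma inside a quoted name
const sampleLine = '6,"Doe, Jane",29';
console.log(naiveSplit(sampleLine));   // [ '6', '"Doe', ' Jane"', '29' ]
console.log(splitCsvLine(sampleLine)); // [ '6', 'Doe, Jane', '29' ]
```

A library handles these cases for us and works on streams, so the file never has to fit in memory in one piece.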

1. Create a new folder for this example, then create a new file named index.js inside it.

2. Download the CSV file from the link above to the root directory of the project. Its contents are simple:

Id,Name,Age
1,John Doe,40
2,Kindacode,41
3,Voldermort,71
4,Joe Biden,80
5,Ryo Hanamura,35

3. Install fast-csv:

npm i fast-csv

4. Add this code into index.js:

const fs = require('fs');
const path = require('path');
const csv = require('fast-csv');

// This function reads data from a given CSV file
const readCSV = (filePath) => {
  const readStream = fs.createReadStream(filePath);
  const data = [];

  readStream
    .pipe(csv.parse())
    .on('data', (row) => {
      // Each row arrives as an array of strings
      data.push(row);
      console.log('Id:', row[0]);
      console.log('Name:', row[1]);
      console.log('Age:', row[2]);
      console.log();
    })
    .on('end', (rowCount) => {
      console.log(`${rowCount} rows has been parsed!`);

      // Do something with the data you get
      console.log(data);
    })
    .on('error', (error) => console.error(error));
};

// Try it
const myFile = path.resolve(__dirname, 'kindacode.csv');
readCSV(myFile);

5. Run the code and see the output:

Id: Id
Name: Name
Age: Age

Id: 1
Name: John Doe
Age: 40

Id: 2
Name: Kindacode
Age: 41

Id: 3
Name: Voldermort
Age: 71

Id: 4
Name: Joe Biden
Age: 80

Id: 5
Name: Ryo Hanamura
Age: 35

6 rows has been parsed!
[
  [ 'Id', 'Name', 'Age' ],
  [ '1', 'John Doe', '40' ],
  [ '2', 'Kindacode', '41' ],
  [ '3', 'Voldermort', '71' ],
  [ '4', 'Joe Biden', '80' ],
  [ '5', 'Ryo Hanamura', '35' ]
]
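Notice that the header line is counted as an ordinary row and that every value is a string. Here is a small sketch of one way to turn rows like these into objects with numeric fields. rowsToObjects is a hypothetical helper written for this article. fast-csv also documents a headers option for parse() that serves a similar purpose, so check the docs for your installed version before rolling your own:

```javascript
// Hypothetical helper: turns [header, ...rows] into an array of objects
const rowsToObjects = ([header, ...rows]) =>
  rows.map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );

// The rows printed by the example above
const parsedRows = [
  ['Id', 'Name', 'Age'],
  ['1', 'John Doe', '40'],
  ['2', 'Kindacode', '41'],
  ['3', 'Voldermort', '71'],
  ['4', 'Joe Biden', '80'],
  ['5', 'Ryo Hanamura', '35'],
];

// CSV values are always strings, so convert the numeric columns explicitly
const people = rowsToObjects(parsedRows).map((person) => ({
  ...person,
  Id: Number(person.Id),
  Age: Number(person.Age),
}));

console.log(people.length); // 5
console.log(people[0]);     // { Id: 1, Name: 'John Doe', Age: 40 }
```

Keeping the conversion separate from the parsing step makes it easy to change which columns are treated as numbers.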


At this point, you should have a better sense of how to read PDF and CSV files in Node.js and feel more confident working with them.

