Apache Superset: Export Data to Excel

Alex Gordienko
Geek Culture
Published in
4 min readMay 22, 2021

--

Apache Superset 1.1.0 with ‘Export to XLSX’ functionality enabled

Apache Superset is a modern BI tool that helps you easily create rich charts and powerful dashboards by combining information on dashboards from various databases connected via SQLAlchemy. Gathered from different sources into a chart, data can be simply exported then to a comma-separated file (CSV) from the Superset right out of the box. That CSV file could be opened via many spreadsheet software including industry-leading Microsoft Excel. But there is at least one problem that could bring you to the point at which you will want to patch your Superset installation to be able to export data from charts to the XLSX format. And that problem is Microsoft Excel’s auto-recognition of values as dates.

Let’s imagine you have a simple table exported from Superset to a CSV file. Here is its content:

Name,Building,Street,ID
Liubov,23/1,Pretty St.,01-04
Mike,17,New St.,16-7

And here is how you would probably think it would look in your spreadsheet:

┌────────┬──────────┬────────────┬───────┐
│ Name │ Building │ Street │ ID │
├────────┼──────────┼────────────┼───────┤
│ Liubov │ 23/1 │ Pretty St. │ 01-04 │
│ Mike │ 17 │ New St. │ 16-7 │
└────────┴──────────┴────────────┴───────┘

If you will try to import this CSV file into Excel, you will be surprised. It is not easy to find the table’s initial view correctly without additional user actions. Your table will probably look like this by default:

┌────────┬──────────┬────────────┬────────┐
│ Name │ Building │ Street │ ID │
├────────┼──────────┼────────────┼────────┤
│ Liubov │ 23-Jan │ Pretty St. │ 01-Apr
│ Mike │ 17 │ New St. │ 16-Jul
└────────┴──────────┴────────────┴────────┘

As you can see, Excel is trying to interpret some numbers as dates where it’s possible based on your system's regional settings. And this problem could lead to errors in decisions people make on the data they see in the table.

There are few ways how to fix this. You could teach your users to open CSV files with additional actions: choosing an ‘Import’ menu item, splitting tables into columns, setting the data type for each column, and so on. This could prevent Excel from auto-interpreting numbers as dates and using user-selected data types. Another solution is switching to different spreadsheet software, Only Office for example, which will ask how to interpret data in CSV file at the opening. But I have found all the options are not good enough as they lead to additional costs for the user consultations or changes in the software list you support in your company (and maybe additional license payments). Therefore, I went through a different pathway.

My solution was to patch the official Apache Superset package by adding a new button for exporting data to XLSX files which open by Excel without data loss and without additional user actions at the same time. XLSX is a way more customizable format than CSV. We can save in it not only values but data types as well.

I use docker, hence I prepared a new Dockerfile to build an intermediate container for generating Node assets with an additional button in the Superset’s interface and copy those assets with corrected backend Python files into a new image. You can find all required files for building your own image in this repository and run it the same way you use the official Superset image. I am not sure about how Superset will grow in the future. I have decided to lock patches to versions in the folder structure as well as in the Dockerfile. If you will find this solution works well for another version without any changes, or you will have time to edit the code to work with another version, please create a new pull request in the repository to help other people have a working solution for exporting data from Apache Superset to XLSX files.

Upd 1/23/22: The custom version supports versions 1.3.2 and 1.4.0 as well as the previous 1.1.0. Also, the run_locally utility was included in the GitHub repo to simplify the process of local running.

Upd 2/11/22: The custom version supports version 1.4.1 as well as 1.1.0, 1.3.2, and 1.4.0. run_locally utility default Superset version is now 1.4.1.

I hope you’ve enjoyed while reading this article. Please, follow me on Medium, GitHub, Twitter, and LinkedIn.

Please, read additional information about running Superset in the Docker container in the previous article. Also, check the article about adding a custom security manager to your Superset installation here.

--

--