@macoj
Last active September 18, 2016 22:32
how to merge many points of a scatterplot made with matplotlib

Approach A

  1. Export the plot as SVG
   plt.savefig("plot.svg")
  1. Open plot.svg in a text editor and add the following filter to its definitions:
<defs>
  <filter id="shadow">
    <feOffset dx="0" dy="0"/>
    <feMerge>
      <feMergeNode/>
      <feMergeNode in="SourceGraphic"/>
    </feMerge>
  </filter>
</defs>
  1. Then, for each group of points, replace:
<g clip-path="url(#pde120285b9)">

with this:

<g filter="url(#shadow)" clip-path="url(#pde120285b9)">    

The clip-path id (pde120285b9 here) differs from file to file.

  1. Finally, use rsvg-convert to convert the SVG to PDF:
rsvg-convert -d 600 -p 600 -f pdf -o output.pdf input.svg

where -d 600 and -p 600 set the horizontal and vertical resolution in dpi.
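The manual edits in Approach A can also be scripted. The following is a minimal sketch, not part of the original recipe: the function names add_merge_filter and patch_svg_file are hypothetical, and it assumes the point groups appear in the file exactly as `<g clip-path="url(#...)">` with no other attributes. It applies the filter to every such group, which may include clipped groups that are not points, so check the result before converting.

```python
import re

# Filter definition from the step above; "shadow" is the id used there.
FILTER_DEFS = """<defs>
  <filter id="shadow">
    <feOffset dx="0" dy="0"/>
    <feMerge>
      <feMergeNode/>
      <feMergeNode in="SourceGraphic"/>
    </feMerge>
  </filter>
</defs>"""


def add_merge_filter(svg_text):
    """Return svg_text with the filter defined and applied to clipped groups."""
    # Insert the <defs> block right after the opening <svg ...> tag.
    svg_text = re.sub(
        r"(<svg\b[^>]*>)",
        lambda m: m.group(1) + "\n" + FILTER_DEFS,
        svg_text,
        count=1,
    )
    # Add the filter attribute to every <g> whose only attribute is a clip-path.
    # Assumption: this is how the groups of points are written in the file.
    return re.sub(
        r'<g clip-path="(url\(#[^"]+\))">',
        r'<g filter="url(#shadow)" clip-path="\1">',
        svg_text,
    )


def patch_svg_file(src_path, dst_path):
    """Read an SVG file, apply add_merge_filter, and write the result."""
    with open(src_path, encoding="utf-8") as src:
        patched = add_merge_filter(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(patched)
```

For example, patch_svg_file("plot.svg", "input.svg") would produce the file that the rsvg-convert command above expects.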

Approach B

  1. For each plot call that you want to rasterize, pass rasterized=True:
plt.plot(..., rasterized=True)
  1. Save the figure, setting the dpi:
plt.savefig('dots.pdf', dpi=600) 
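Put together, Approach B might look like the sketch below. The function name save_rasterized_plot, the random Gaussian data, the marker size, and the Agg backend are all assumptions made for illustration; only rasterized=True and the dpi argument come from the steps above.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def save_rasterized_plot(target, n_points=100_000, dpi=600, seed=0):
    """Plot n_points random points as rasterized markers and save them as PDF."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n_points)]
    y = [rng.gauss(0, 1) for _ in range(n_points)]

    fig, ax = plt.subplots()
    # rasterized=True embeds the markers as a bitmap; axes and text stay vector.
    ax.plot(x, y, ".", markersize=1, rasterized=True)
    # dpi sets the resolution of the rasterized part of the PDF.
    fig.savefig(target, format="pdf", dpi=dpi)
    plt.close(fig)
```

Calling save_rasterized_plot("dots.pdf") writes a file comparable to the one saved in the step above.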

Differences between the approaches

File sizes after applying each approach to a plot with more than 100k points:

1.2M original_file.pdf
160K approach_a.pdf
272K approach_b.pdf

Verdict

Although Approach A produced a smaller file in this test, Approach B is considerably easier to use. Thus, Approach B is the preferred choice.
