pdf2htmlEX is free software that converts PDF files to HTML without losing text or formatting. It is a command-line tool that works quickly and accurately. The output HTML file is usually about the same size as the original PDF, or larger if the PDF contains images, but it can be opened in any web browser. Because HTML is more flexible than PDF, the converted file is also easier to edit and reuse.
Such conversions have mainly been useful for presenting PDF content online as HTML, although we recently covered a free service to broadcast PDF files. If you still need to convert PDF to HTML, this software will do the trick.
How to convert PDF to HTML by using pdf2htmlEX:
First, download the zip file and extract it to any folder. Open the extracted folder, hold down the Shift key, and right-click in a blank area. The context menu will contain an “Open command window here” command; click it and a command window opens in that folder. Now copy the PDF file to the extracted folder (otherwise you have to provide its complete path) and type the following command at the command prompt:
pdf2htmlEX <input.pdf> <output.html>
After you press Enter, the program starts converting your PDF to HTML. The command above is the general form, with all default values. The program also supports various options: for example, you can convert only specific pages, split pages, set the image quality and image format, choose whether to embed CSS, fonts, and JavaScript, optimize text, and more. By default, it embeds everything from the PDF in a single HTML file.
All images are also saved inside the output HTML file using inline encoding.
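If you convert files often, the basic command can be scripted instead of typed each time. The following is a minimal Python sketch, not part of pdf2htmlEX itself. The helper names build_command and convert are hypothetical, and the sketch assumes the pdf2htmlEX executable can be found on your PATH. It uses only the general form of the command, pdf2htmlEX <input.pdf> <output.html>, with default settings.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, html_name=None, exe="pdf2htmlEX"):
    """Return the argument list for a default conversion.

    Hypothetical helper mirroring: pdf2htmlEX <input.pdf> <output.html>
    """
    pdf = Path(pdf_path)
    if pdf.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {pdf.name!r}")
    # Default the output name to the input name with an .html extension.
    out = html_name or pdf.with_suffix(".html").name
    return [exe, str(pdf), out]


def convert(pdf_path, html_name=None):
    """Run the conversion; assumes pdf2htmlEX is installed and on PATH."""
    cmd = build_command(pdf_path, html_name)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("pdf2htmlEX was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call convert("mypdf.pdf") to actually run it.
    print(build_command("mypdf.pdf"))
```

Passing the arguments as a list avoids quoting problems when a file name contains spaces.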
You can get help at any time by using the -h option at the command prompt.
Syntax: pdf2htmlEX -h
It shows the complete list of options that can be used with this program.
The following example sets the range of pages to convert. For example, to convert pages 4 through 10 to HTML, the command is:
pdf2htmlEX --first-page 4 --last-page 10 mypdf.pdf myhtml.html
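The page-range command can also be assembled programmatically, which is handy when the range comes from user input. This is a rough Python sketch under stated assumptions: the function names page_range_args and build_range_command are hypothetical, and only the --first-page and --last-page options from the example above are used. The range check is a precaution added here, not documented behaviour of the tool.

```python
import shlex


def page_range_args(first, last):
    """Return the page-range options for pages first through last."""
    # Sanity check added for this sketch; pages are counted from 1.
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    return ["--first-page", str(first), "--last-page", str(last)]


def build_range_command(pdf, html, first, last, exe="pdf2htmlEX"):
    """Build the full command, with options before the file names."""
    return [exe, *page_range_args(first, last), pdf, html]


if __name__ == "__main__":
    cmd = build_range_command("mypdf.pdf", "myhtml.html", 4, 10)
    # Prints the command line; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Run as a script, this prints the same command line as the example above, so it can be checked by eye before anything is executed.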
Some key features of this free PDF to HTML converter are:
- Maintains fonts and formatting in the output HTML file.
- Optimizes HTML files for web display.
- Lets you set a page range for the output.
- Keeps text selectable, so it can be copied and extracted easily.
- Handles links and hyperlinks in the PDF.
- Produces one single HTML file (if default settings are used).
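The single-file output relies on inline encoding of resources such as images. As an illustration of the general technique (a data URI with Base64 content), and not pdf2htmlEX's actual code, here is a small Python sketch. The function names to_data_uri and img_tag are hypothetical.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a data URI that can sit directly in HTML."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(data: bytes, mime_type: str = "image/png") -> str:
    """Return an <img> tag whose source is embedded rather than linked."""
    return f'<img src="{to_data_uri(data, mime_type)}">'


if __name__ == "__main__":
    # Base64 turns every 3 bytes into 4 characters, so embedded data grows.
    raw = bytes(300)
    print(len(raw), "bytes ->", len(base64.b64encode(raw)), "characters")
```

Base64 makes the embedded data roughly one third larger than the raw bytes, which is one reason an HTML file with inline images can end up bigger than the source PDF.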
Conclusion:
Overall, pdf2htmlEX is a nice program that preserves the formatting and text in the output HTML file. Because HTML files can be read in any web browser, no additional programs or plugins are required to open the converted PDF. The best part is that it embeds all the data in a single HTML file using inline encoding, so no separate folder is created for additional files.