srtparser.h is a simple yet powerful single-header C++ srt subtitle parsing library that lets you easily handle, process and manipulate srt subtitle files in your project. It is an extension of Oleksii Maryshchenko’s simple subtitle-parser. It has the following features:
- It is a single-header C++ (CPP) file and can be easily used in your project.
- Focus on portability, efficiency and simplicity, with no external dependencies.
- A wide variety of functions at the programmer's disposal to parse an srt file as needed.
- Capable of:
  - extracting and stripping HTML and other styling tags from subtitle text.
  - extracting and stripping speaker names.
  - extracting and stripping non-dialogue text.
- Easy to extend with new functionality.

srtparser.h is a robust, cross-platform srt subtitle parser.
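
As a rough illustration of what the stripping features listed above amount to, here is a minimal standalone sketch. It is not srtparser.h's actual implementation: the helper names `stripDelimited` and `stripSpeakerName` are hypothetical, and it assumes that style tags are wrapped in `<...>`, non-dialogue text in `(...)`, and that a speaker name is a purely alphabetic prefix ending in a colon.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of srtparser.h): removes every span that
// starts with `open` and ends with the next `close`, and stores each removed
// span, delimiters included, in `removed`.
std::string stripDelimited(const std::string &text, char open, char close,
                           std::vector<std::string> &removed) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == open) {
            const std::size_t end = text.find(close, i + 1);
            if (end != std::string::npos) {
                removed.push_back(text.substr(i, end - i + 1));
                i = end + 1;
                continue;
            }
        }
        out += text[i];  // ordinary character, or an opener that is never closed
        ++i;
    }
    return out;
}

// Hypothetical helper: if `line` starts with an alphabetic name followed by a
// colon (assumed convention, e.g. "JOHN: Hi"), stores the name in `speaker`
// and returns the rest of the line; otherwise returns `line` unchanged.
std::string stripSpeakerName(const std::string &line, std::string &speaker) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return line;
    for (std::size_t i = 0; i < colon; ++i) {
        if (!std::isalpha(static_cast<unsigned char>(line[i]))) return line;
    }
    speaker = line.substr(0, colon);
    const std::size_t start = line.find_first_not_of(' ', colon + 1);
    return start == std::string::npos ? std::string() : line.substr(start);
}
```

Calling `stripDelimited(text, '<', '>', tags)` collects style tags, while `stripDelimited(text, '(', ')', nonDialogue)` collects parenthesised non-dialogue text. The library itself may recognise more conventions than this sketch does.
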
- Download srtparser.h from https://github.com/saurabhshri/simple-yet-powerful-srt-subtitle-parser-cpp
- Include the header file in your program.

      #include "lib/srtparser.h"

- Create a SubtitleParserFactory object. Use this factory object to create a SubtitleParser object.

      SubtitleParserFactory *subParserFactory = new SubtitleParserFactory("inputFile.srt");
      SubtitleParser *parser = subParserFactory->getParser();

      // to get subtitles
      std::vector<SubtitleItem*> sub = parser->getSubtitles();

- Call the appropriate functions to perform parsing.

See demo usage in the examples directory.

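
For readers unfamiliar with the format the parser consumes: an .srt file is a sequence of blocks, each made of a numeric index line, a timing line of the form `start --> end`, and one or more text lines, with a blank line between blocks. The sketch below reads that structure without the library. It is only an illustration under those assumptions, not how srtparser.h is implemented; `Cue` and `parseSrt` are hypothetical names (the library's own item type is SubtitleItem).

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plain struct for this sketch only.
struct Cue {
    int index;
    std::string start;  // e.g. "00:00:01,000"
    std::string end;    // e.g. "00:00:04,000"
    std::string text;   // text lines joined with '\n'
};

// Drops a trailing '\r' so files with Windows line endings parse the same way.
static void dropCarriageReturn(std::string &line) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
}

// Splits raw .srt data into cues: index line, timing line, then text lines
// up to the next blank line. Blocks with a malformed timing line are skipped.
std::vector<Cue> parseSrt(const std::string &data) {
    std::vector<Cue> cues;
    std::istringstream in(data);
    std::string line;
    while (std::getline(in, line)) {
        dropCarriageReturn(line);
        if (line.empty()) continue;  // blank separator between blocks

        Cue cue{};
        cue.index = std::atoi(line.c_str());

        if (!std::getline(in, line)) break;  // file ended before the timing line
        dropCarriageReturn(line);
        const std::size_t arrow = line.find(" --> ");
        if (arrow == std::string::npos) continue;
        cue.start = line.substr(0, arrow);
        cue.end = line.substr(arrow + 5);

        while (std::getline(in, line)) {
            dropCarriageReturn(line);
            if (line.empty()) break;  // end of this block
            if (!cue.text.empty()) cue.text += '\n';
            cue.text += line;
        }
        cues.push_back(cue);
    }
    return cues;
}
```
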
The following is a complete list of available parser functions.
Syntax:
Class | Return Type | Function | Description |
---|---|---|---|
SubtitleParserFactory | SubtitleParserFactory | SubtitleParserFactory("inputFile.srt") | Creates a SubtitleParserFactory object. Here inputFile.srt is the path of the subtitle file to be parsed. This object is used to create the parser. |
SubtitleParserFactory | SubtitleParser | getParser() | Returns the SubtitleParser object. This object is used to parse the subtitle file. |
SubtitleParser | std::vector<SubtitleItem*> | getSubtitles() | Returns the subtitles as a vector of SubtitleItem pointers. |
SubtitleParser | std::string | | Returns the complete file data, read as-is from inputFile.srt. |
SubtitleItem | long int | | Returns the starting time of the subtitle in milliseconds. |
SubtitleItem | long int | | Returns the ending time of the subtitle in milliseconds. |
SubtitleItem | std::string | | Returns the starting time of the subtitle in srt format. |
SubtitleItem | std::string | | Returns the ending time of the subtitle in srt format. |
SubtitleItem | std::string | | Returns the subtitle text as present in the .srt file. |
SubtitleItem | std::string | | Returns the subtitle text after processing according to the parameters: keepHTML = 1 stops the parser from stripping style tags; doNotIgnoreNonDialogues = 1 stops the parser from ignoring and extracting non-dialogue text such as (laughter); doNotRemoveSpeakerNames = 1 stops the parser from ignoring and extracting speaker names. By default, (0,0,0) is passed. |
SubtitleItem | int | | Returns the number of words present in the subtitle dialogue. |
SubtitleItem | std::vector<std::string> | | Returns a string vector of the individual words present in the subtitle. |
SubtitleItem | bool | | Returns the ignore status: true if the justDialogue field, i.e. the subtitle after processing, is empty. |
SubtitleItem | int | | Returns the number of speakers present in the subtitle. |
SubtitleItem | std::vector<std::string> | | Returns a string vector of speaker names. |
SubtitleItem | int | | Returns the number of non-dialogue words present in the subtitle. |
SubtitleItem | std::vector<std::string> | | Returns a string vector of non-dialogue words. |
SubtitleItem | int | | Returns the number of style tags present in the subtitle. |
SubtitleItem | std::vector<std::string> | | Returns a string vector of style tags. |
SubtitleWord | std::string | | Returns the subtitle text as present in the .srt file. |

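
The table describes times both in milliseconds and in srt format, and a list of individual words. To make that concrete, here is a minimal standalone sketch of those conversions. It does not call srtparser.h and is not its implementation: the names `srtTimeToMillis`, `millisToSrtTime` and `splitWords` are hypothetical, the srt timestamp layout is taken to be `HH:MM:SS,mmm`, and words are assumed to be separated by whitespace.

```cpp
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts "HH:MM:SS,mmm" to milliseconds.
// Returns -1 if the string does not match that layout.
long int srtTimeToMillis(const std::string &stamp) {
    int h = 0, m = 0, s = 0, ms = 0;
    if (std::sscanf(stamp.c_str(), "%d:%d:%d,%d", &h, &m, &s, &ms) != 4) {
        return -1;
    }
    return ((h * 60L + m) * 60L + s) * 1000L + ms;
}

// Hypothetical helper: formats a non-negative millisecond count as
// "HH:MM:SS,mmm".
std::string millisToSrtTime(long int total) {
    const long int ms = total % 1000;
    const long int s = (total / 1000) % 60;
    const long int m = (total / 60000) % 60;
    const long int h = total / 3600000;
    char buf[32];
    std::snprintf(buf, sizeof buf, "%02ld:%02ld:%02ld,%03ld", h, m, s, ms);
    return buf;
}

// Hypothetical helper: splits text on whitespace into individual words.
std::vector<std::string> splitWords(const std::string &text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}
```

The word count of a subtitle is then simply `splitWords(text).size()`. The library may tokenise differently, for example around punctuation.
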
While I’ve tried to include examples in the above table, a compilation of all of them together in a single C++ program can be found in the example directory.
Suggestions, feature requests, PRs, bug reports and bug fixes are welcome. I’ll be thankful.
Built upon an MIT-licensed simple subtitle parser called LibSub-Parser by Oleksii Maryshchenko.
The original parser had 3 major functions: getStartTime(), getEndTime() and getText().
The rest of the work was done by Saurabh Shrivastava, originally for use in his GSoC project.