SoniBooks #8

Open
zakkariyaa opened this issue Jun 5, 2023 · 0 comments

SoniBooks: A Delight For the Busy Reader

Convert PDF books into Audio books

[Image]

(I know, I know, he is not listening to a book. I just love this scene.)

Problem:

The problem we aim to solve is the limited accessibility and convenience of reading PDF books. Many individuals prefer listening to books rather than reading them, whether due to visual impairments, multitasking preferences, or personal convenience. However, converting PDFs to audiobooks in a user-friendly and efficient manner is currently a challenge.

Stakeholders:

  1. Users: Individuals who prefer listening to books or have difficulty reading due to visual impairments.
  2. Educational Institutions: Teachers and students who can benefit from audio versions of textbooks and study materials.
  3. Content Creators: Authors and publishers who want to extend the reach and accessibility of their digital books.

Other Considerations:

  • Ensuring a seamless and intuitive user experience for uploading and converting PDF files.
  • Maintaining the audio quality and accurate conversion of text to speech.
  • Implementing robust error handling and covering edge cases during the conversion process.
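
One way to approach the error-handling point above is to reject bad uploads before any conversion starts. The following is a minimal sketch, assuming the server has the uploaded bytes in memory. The names `checkPdfUpload` and `MAX_PDF_BYTES`, the 50 MB limit, and the reason codes are hypothetical and not part of this proposal.

```typescript
// Hypothetical size limit for this sketch; the proposal does not specify one.
const MAX_PDF_BYTES = 50 * 1024 * 1024;

type UploadCheck =
  | { ok: true }
  | { ok: false; reason: "empty" | "too_large" | "not_pdf" };

// Checks the raw bytes of an upload before any conversion work starts.
function checkPdfUpload(
  bytes: Uint8Array,
  maxBytes: number = MAX_PDF_BYTES
): UploadCheck {
  if (bytes.length === 0) return { ok: false, reason: "empty" };
  if (bytes.length > maxBytes) return { ok: false, reason: "too_large" };

  // PDF files begin with the ASCII signature "%PDF-".
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d];
  const matches = signature.every((byte, i) => bytes[i] === byte);
  return matches ? { ok: true } : { ok: false, reason: "not_pdf" };
}
```

Returning a reason code instead of throwing lets the app show a specific message for each case. A header check like this only catches obviously wrong files; a PDF that is corrupt further in would still need to be handled when parsing fails.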

Current Solutions:

  • Voice Dream Reader: This app allows users to import various document formats, including PDF, and converts them into high-quality audio for listening. It offers features like adjustable reading speed, voice selection, and text highlighting. (Available for iOS and Android)

  • NaturalReader: NaturalReader offers a PDF to audio conversion feature where users can upload PDF files and convert them into natural-sounding speech. The app provides customization options for voice selection, reading speed, and even supports multiple languages. (Available for iOS, Android, Windows, and macOS)

Existing solutions for converting PDFs to audiobooks are restrictive in that most of them are paid and lack free versions.

Technology Stack:

Frontend:

  • React Native: A cross-platform JavaScript framework for building mobile applications.
  • React Navigation: A library for implementing navigation and routing in the app.
  • Redux: A predictable state container for managing application state.

Backend:

  • Node.js: A JavaScript runtime for building server-side applications.
  • Express.js: A minimal and flexible web application framework for handling backend API requests.
  • Multer: A middleware for handling file uploads in Node.js.

PDF Conversion:

  • pdf2json: A library to parse PDF files and extract text content.
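
To illustrate the extraction step, here is a rough sketch of flattening parsed PDF data into plain text. It assumes the parser output has the shape described in pdf2json's documentation (pages containing text blocks, whose runs carry URI-encoded text in a `T` field); that shape may differ between versions and should be checked. The interfaces and the helpers `safeDecode` and `extractPlainText` are hypothetical names for this example.

```typescript
// Assumed shape of the parsed output: pages hold text blocks,
// each block holds runs whose `T` field is URI-encoded text.
interface PdfRun { T: string }
interface PdfTextBlock { R: PdfRun[] }
interface PdfPage { Texts: PdfTextBlock[] }
interface ParsedPdf { Pages: PdfPage[] }

// Decodes a run, falling back to the raw string if it is not valid URI encoding.
function safeDecode(encoded: string): string {
  try {
    return decodeURIComponent(encoded);
  } catch {
    return encoded;
  }
}

// Joins all runs into one string per page and separates pages with a blank line.
function extractPlainText(pdf: ParsedPdf): string {
  return pdf.Pages
    .map((page) =>
      page.Texts
        .map((block) => block.R.map((run) => safeDecode(run.T)).join(""))
        .join(" ")
        .replace(/\s+/g, " ")
        .trim()
    )
    .filter((pageText) => pageText.length > 0)
    .join("\n\n");
}
```

Pages with no text (for example, scanned images) come out empty and are skipped here, which is one of the edge cases the app would need to report to the user.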

Text-to-Speech:

  • Google Cloud Text-to-Speech API: A cloud-based API for converting text into high-quality speech.
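
A whole book will not fit in a single text-to-speech request, so the extracted text has to be split first. Below is a minimal sketch of splitting at sentence boundaries. The 5000-byte limit is an assumption to verify against the current Google Cloud Text-to-Speech quota documentation, and `utf8Length`, `packPieces`, and `splitForSpeech` are hypothetical helper names.

```typescript
// Assumed per-request input limit in bytes; verify against the current quotas.
const MAX_REQUEST_BYTES = 5000;

// Counts UTF-8 bytes without relying on platform encoders.
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const next = text.charCodeAt(i + 1); // NaN past the end of the string
    if (unit < 0x80) bytes += 1;
    else if (unit < 0x800) bytes += 2;
    else if (unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else bytes += 3;
  }
  return bytes;
}

// Greedily joins pieces with single spaces while staying within maxBytes.
function packPieces(pieces: string[], maxBytes: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const piece of pieces) {
    if (utf8Length(piece) > maxBytes) {
      // Not handled in this sketch: a single word longer than the limit.
      throw new Error("piece exceeds the byte limit on its own");
    }
    const candidate = current === "" ? piece : current + " " + piece;
    if (utf8Length(candidate) <= maxBytes) {
      current = candidate;
    } else {
      chunks.push(current);
      current = piece;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Splits text into request-sized chunks, preferring sentence boundaries.
function splitForSpeech(text: string, maxBytes: number = MAX_REQUEST_BYTES): string[] {
  // A sentence ends at ., ! or ? followed by whitespace or the end of the text.
  const sentences = text.match(/\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g) || [];
  const pieces: string[] = [];
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (utf8Length(sentence) <= maxBytes) pieces.push(sentence);
    else for (const word of sentence.split(/\s+/)) pieces.push(word); // oversized sentence
  }
  return packPieces(pieces, maxBytes);
}
```

Cutting at sentence boundaries rather than at a fixed character count should keep the narration from breaking mid-sentence when the audio segments are stitched together. Note that this sketch replaces line breaks between sentences with single spaces.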

Data Storage:

  • Ideally Firebase, since it handles the whole backend: storage, user management, authentication, etc.
  • MongoDB: A NoSQL document database for storing user data and converted audiobooks.

Deployment and Hosting:

  • Fly.io

Stretch Goal:

This is the fun part. I admit this is what I first thought of and wanted to do, but its implementation is rather complex and requires knowledge of even more complex technologies. In a nutshell, we want to let users choose the narrator's voice. Maybe you want to fall asleep to James Earl Jones narrating 'The Shining', Meryl Streep reading '1984', or Morgan Freeman going through 'Diary of a Wimpy Kid'.

[Image: Shawshank Redemption]

In the future, we can explore integrating machine learning techniques so that users can select specific voices for the audiobook narration. This would involve leveraging Natural Language Processing (NLP) libraries and training models to recognize and synthesize voices.
