Created an app that narrates technical PDFs by converting visual elements into descriptive audio

This passion project transforms technical PDFs—including graphs, tables, and two-column white papers—into spoken word. Currently on Android, it serves commuters and learners seeking accessible, narrated educational content.

I have worked on projects that involve transforming technical documents into audio formats, and one challenge that stands out is accurately capturing the layout nuances of technical PDFs. It was essential to develop algorithms that recognize not only text but also structural elements like graphs, tables, and multiple columns. I found that using a two-stage process, first parsing the layout and then converting each element to descriptive language, helped maintain clarity. Continuous testing with real users was crucial in fine-tuning the system to deliver narration that is both informative and accessible.
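To make the two-stage idea concrete, here is a rough Python sketch. It is not the code from my project or from this app: the `Block` type, its `kind` labels, the `reading_order`, `describe`, and `narrate` helpers, and the 612-point page width are all made up for illustration, and it assumes some earlier parser has already extracted blocks with positions.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Block:
    """One parsed page element (hypothetical structure, for illustration only)."""
    kind: str     # "text", "table", or "graph" (assumed labels)
    x: float      # left edge of the block on the page
    y: float      # top edge of the block (0 = top of page)
    content: Any  # str for text, list of rows for tables, dict for graphs


def reading_order(blocks: List[Block], page_width: float) -> List[Block]:
    """Stage 1 (layout): read the left column top to bottom, then the right one."""
    mid = page_width / 2
    # False sorts before True, so left-column blocks come first.
    return sorted(blocks, key=lambda b: (b.x >= mid, b.y))


def describe(block: Block) -> str:
    """Stage 2: turn one parsed block into a sentence a TTS engine could read."""
    if block.kind == "text":
        return block.content
    if block.kind == "table":
        header, *rows = block.content
        return f"Table with {len(rows)} rows and columns: {', '.join(header)}."
    if block.kind == "graph":
        g = block.content
        return f"Graph titled {g['title']}, showing {g['y']} against {g['x']}."
    return "Unrecognized element skipped."


def narrate(blocks: List[Block], page_width: float = 612.0) -> List[str]:
    """Run both stages: order the blocks, then describe each one."""
    return [describe(b) for b in reading_order(blocks, page_width)]


sample = [
    Block("text", 320, 50, "Right column."),
    Block("text", 40, 50, "Left column."),
    Block("table", 40, 200, [["Year", "Sales"], ["2020", "10"], ["2021", "12"]]),
]
script = narrate(sample)
```

Keeping the stages separate means the layout logic can be tested on block positions alone, and the wording of the descriptions can be tuned with real listeners without touching the parser. A simple midpoint split like this will misorder full-width elements such as titles or wide figures, so a real parser would need to handle those separately.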

Hey there, ExploringOcean! Really cool project you’re working on! Your idea to turn those technical PDFs into a narrated experience sounds fun and super useful—especially for anyone who might struggle with reading on traditional devices. I’m curious though, how do you handle the narration of complex elements like graphs and tables? Do you have different narration modes depending on the type of content? Also, have you thought about how you might tweak the reading flow to deal with varied layouts such as two-column documents? I’d love to hear more about your development process and any interesting hurdles you’ve overcome so far! :blush:

hey exploringocean, awesome idea. wonder if double column layouts trigger parsing issues? got similar hurdles on my project once. love seeing accessible tech like this come up, best of luck with those tricky formats!