Archive for the ‘MathML’ Category

RDC Home Movies!

MileMarker 36: Understanding The Symbols

Greetings from #rdcHQ! We have been so busy in our Codes of the World track that there has been little time to create tutorial videos … however, we are pleased to announce that Nathan Malamud has created his third video in his MathML tutorial series: “Understanding Operators and Symbols.”

The word from the editing room is that a fourth episode is most definitely planned and eventually the videos will be compiled into a larger film to give folks a better idea of all of the cool things that happen at Rural Design Collective Headquarters (#rdcHQ). It is impossible to keep tabs on it all – but we most certainly try!

MileMarker 37: Create A Limited Edition Poster! :-)

In other MathML news – we will be creating a limited edition run of our MathML Poster for our Rural Design Collective Launch Party happening Labor Day Weekend.

{~beaux libre*} MathML Poster by Rebecca Hargrave Malamud and The Rural Design Collective

We’ll have a lot to celebrate … Stay Tuned!

Codes of the World Meetup

Greetings from #rdcHQ! This week in our “Codes of the World” Track, we began coding our first set of MathML equations and are off to a great start … we have roughly 25% of all of the equations for our summer track coded using Amaya! We even have a couple of cool videos showing what we have learned to date.

MileMarker 3: Installing Amaya

The first video tells you everything you need to know about downloading and installing Amaya.

MileMarker 4: Coding in MathML

The second video shows how to code a basic equation – complete with a Rock-n-Roll soundtrack :-)

Special thanks to Nathan Malamud for putting these videos together!

We’ll be checking the MathML in a web browser next to be certain that all of the equations render properly and to make sure the code is valid. Next week, we’ll be learning more about the mathematical symbols and what they mean, as well as getting into the source and understanding the presentation elements of MathML. We’re pretty excited at the progress!
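For readers who want to try a similar check at home, here is a minimal sketch in Python. It is not the workflow we use at #rdcHQ (we code in Amaya and check in a browser). The sample equation, the function names, and the choice of the standard-library XML parser are all assumptions made for illustration. The sketch only confirms that the markup is well-formed XML and lists the presentation elements it contains; it does not validate against the MathML schema.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A small hand-written sample (x squared plus one); an illustrative
# assumption, not one of the equations from our summer track.
SAMPLE = f"""<math xmlns="{MATHML_NS}">
  <mrow>
    <msup><mi>x</mi><mn>2</mn></msup>
    <mo>+</mo>
    <mn>1</mn>
  </mrow>
</math>"""


def presentation_elements(source):
    """Parse MathML and return its element names in document order.

    Raises xml.etree.ElementTree.ParseError if the markup is not
    well-formed XML.
    """
    root = ET.fromstring(source)
    prefix = "{" + MATHML_NS + "}"
    names = []
    for element in root.iter():
        tag = element.tag
        # ElementTree reports namespaced tags as "{namespace}name".
        names.append(tag[len(prefix):] if tag.startswith(prefix) else tag)
    return names


def is_well_formed(source):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True
```

Calling `presentation_elements(SAMPLE)` lists the token elements (`mi`, `mn`, `mo`) alongside the layout elements (`mrow`, `msup`) that wrap them, which is a handy way to see how an equation is put together before opening it in a browser.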

RDC Veteran Jasper Shumaker-Pruitt is pitching in this summer doing some preliminary research on ways to automate steps in our workflow. We’ll be exploring some pretty cool new tools in the process. We’ll report our findings here.

GO #rdcHQ Go!

DocGen / Gone Hybrid

Our DocGen Track continues with explorations in the PDF space. We have been experimenting with a couple of programs that bridge the gap where Tesseract falls short. The first of these two programs is the newly released Nuance PDF Converter for Mac 3.0, which does an excellent job deciphering even the poorest quality scans and provides multilingual support. Although this is a commercial product, it is certainly worth the money. It is a standalone module built on OmniPage, the industry leader in the OCR space for years. It does two things really well: 1) it provides accurate optical character recognition (as we mentioned before), and 2) it creates first-generation interactive PDFs. These pros far outweigh the cons, such as the software only being available on the Mac platform (Jasper is already researching ways to get the program to run on Linux using an emulator).

Interactive PDFs

In short, interactive PDFs provide the online equivalent of paper forms that one can fill out on the computer and send electronically. These PDFs add a layer on top of an Image PDF that can include internal hyperlinks to other sections of the document, external links to the World Wide Web or an object on one’s computer, and form elements that can be filled out, digitally signed, and sent to a designated email address. Pretty neat!

Testing the tab order on an Interactive PDF

Hybrid PDFs

Another interesting program that we are working with is LibreOffice, particularly its “Hybrid PDF” functionality, which delivers on the true promise of a portable document format. A “Hybrid PDF” is a PDF that anyone can view, but which includes the source document embedded within it so that people with modern office suites can also edit it if they want to. You can read more in this PDF about creating Hybrid PDFs (see also: “The Magic of Editable PDFs” from the author of that document).

So, we definitely have a lot of interesting tools to work with to complete our project. Although we were unable to use Tesseract to produce all of our document objects, there is a lot of interesting work being done in that space, most notably newly hatched open source efforts with minimal documentation, which we are going to keep a watch on in the months ahead. Our solution for the CCR will be a hybrid model. We hold to our open source ethos, but we also believe in a multimodal approach to solving a problem and in using the best tool for the project at hand @ #rdcHQ.

Onward!

#rdcHQ MathML / SVG / DocGen Update

Busy, busy, busy at #rdcHQ! We successfully converted 311 MathML equations using SVGMath. You can view our progress here. The original equation is on the left, and our final SVG is on the right (all of these equations are now coded in MathML, which we use as the source for our conversion). We are currently in the process of checking our work against the original graphics. We also kicked off the DocGen portion of our program by batch-processing 200+ basic text content objects in Tesseract. The results are promising, and we are exploring ways we can train Tesseract for better results.
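For anyone curious what a Tesseract batch run can look like, here is a minimal sketch in Python. It is not the script we ran at #rdcHQ. The folder names, the `*.png` pattern, and both function names are hypothetical, and it assumes the `tesseract` command is installed and on your path. It relies only on the basic command-line form `tesseract <image> <output base>`, which writes the recognized text to `<output base>.txt`.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_dir):
    """Return the argument list for one OCR run: tesseract <image> <output base>."""
    image_path = Path(image_path)
    # Tesseract appends ".txt" to the output base on its own.
    output_base = Path(output_dir) / image_path.stem
    return ["tesseract", str(image_path), str(output_base)]


def batch_ocr(input_dir, output_dir, pattern="*.png"):
    """Run Tesseract on every matching image in input_dir.

    Returns the paths of the text files Tesseract is expected to write.
    Stops with subprocess.CalledProcessError if any run fails.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    produced = []
    for image_path in sorted(Path(input_dir).glob(pattern)):
        command = build_tesseract_command(image_path, output_dir)
        subprocess.run(command, check=True)
        produced.append(Path(command[2] + ".txt"))
    return produced
```

A call such as `batch_ocr("scans", "text")` (both folder names are made up) would process the images in sorted order, one Tesseract run per file. Keeping the command builder separate from the loop makes it easy to inspect exactly what will be run before committing to a batch of 200+ objects.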

SVG Update: Autotrace (Outline Method)

Our SVG team successfully completed work on Labels and Flowcharts. We are also cleaning up work on our Map collection that we converted to SVG using the Autotrace function. We’re currently cleaning up symbols and typography … more updates soon … Stay Tuned!
