rOpenSci - open tools for open science
Open Tools and R Packages for Open Science
Published
Author Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: We have now also updated the R package tesseract to ship with the new Tesseract 4 on macOS and Windows. It uses the new engine by default, and the results are extremely impressive!

Published

While many people groan at the thought of participating in a group ice breaker activity, we’ve gotten consistent feedback from people who have been to recent rOpenSci unconferences. We’ve had lots of requests for a detailed description of how we do it. This post shares our recipe, including a script you can adapt, a reflection on its success, examples of how others have used it, and some tips to remember.

Published

rOpenSci’s software engineer / postdoc Jeroen Ooms will explain what images are, under the hood, and showcase several rOpenSci packages that form a modern toolkit for working with images in R, including opencv, av, tesseract, magick and pdftools. 🕘 Thursday, November 15, 2018, 10-11AM PST; 7-8PM CET (find your timezone) ☎️ Find all details for joining the call on our Community Calls page. Everyone is welcome. No RSVP needed.

Published
Author Scott Chamberlain

pubchunks is a package grown out of the fulltext package. fulltext provides a single interface to many sources of full text scholarly articles. As part of the user flow in fulltext there is an extraction step where fulltext::chunks() pulls parts of articles out of XML format article files.

Published
Author Thomas Klebel

Every R package has its story. Some packages are written by experts, some by novices. Some are developed quickly, others were long in the making. This is the story of jstor, a package which I developed during my time as a student of sociology, working in a research project on the scientific elite within sociology.

Published

Proper identification of individuals is crucial for acknowledging and studying their scientific work, be it journal articles or pieces of software. In this tech note, one year after CRAN started supporting ORCIDs, we shall explain why and how to use unique author identifiers in DESCRIPTION files.

Why use ORCIDs on CRAN?

When analyzing the authorship of CRAN packages, one can look at authors’ names and email addresses.
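
As a rough illustration of the "how", an ORCID iD is typically attached to an author through the comment argument of person() in the Authors@R field of a DESCRIPTION file. The sketch below shows only that person() call. The author name, email address and iD are placeholders chosen for illustration, not taken from the tech note.

```r
# Hypothetical author: the name, email and ORCID iD are placeholders.
# This call is the kind of value that goes in the Authors@R field of DESCRIPTION.
person(
  given = "Jane",
  family = "Doe",
  email = "jane.doe@example.org",
  role = c("aut", "cre"),
  comment = c(ORCID = "0000-0002-1825-0097")
)
```

The iD is given as a named element of comment, so it stays attached to one specific author even when several people share a name.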

Published
Author Jeroen Ooms

At rOpenSci we are developing a suite of packages that expose powerful graphics and imaging libraries in R. Our latest addition is av, a new package for working with audio/video based on the FFmpeg AV libraries. This ambitious new project will become the video counterpart of the magick package which we use for working with images.

Published

Do you have code that accompanies a research project or manuscript? How do you review and archive that code before you submit a paper? Our next Community Call will present different perspectives on this hot topic, with plenty of time for Q&A. What’s the culture of the group around feedback and code collaboration? What are the use cases? What are some practices that can be adopted?

Published
Author Rafael Pilliard Hellwig

Background

Surveys are ubiquitous in the social sciences, and the best of them are meticulously planned out. Statisticians often decide on a sample size based on a theoretical design, and then proceed to inflate this number to account for “sample losses”. This ensures that the desired sample size is achieved, even in the presence of non-response.
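
The inflation step can be sketched with a simple formula. It assumes, for illustration only, that non-response is the sole source of loss and that a single response rate is anticipated. The symbols $n_{\text{target}}$, $n_{\text{field}}$ and $r$ are introduced here and do not come from the original post.

```latex
% Assumption: the only sample loss is non-response, with anticipated response rate r (0 < r <= 1).
n_{\text{field}} = \left\lceil \frac{n_{\text{target}}}{r} \right\rceil
% Worked example: n_target = 1000 completed responses, r = 0.8
% n_field = \lceil 1000 / 0.8 \rceil = 1250
```

Under these assumptions, a design that needs 1000 completed responses with an expected response rate of 80% would field 1250 units.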