Rogue Scholar

Published December 8, 2016

Author Jeroen Ooms

A few weeks ago we announced the first release of the tesseract package: a high-quality OCR engine in R. We have now released an update with extra features. Installing Training Data: As explained in the first post, the tesseract system is powered by language-specific training data. By default only English training data is installed. Version 1.3 adds utilities to make it easier to install additional training data.

Packages, Commonmark, Markdown, Computer and Information Sciences

High Performance CommonMark and Github Markdown Rendering in R

https://doi.org/10.59350/h5crj-yst34

Published December 2, 2016

Author Jeroen Ooms

This week the folks at GitHub have open sourced their fork of libcmark (based on the extensive PR by Mathieu Duponchelle), which they use to render markdown text within documents, issues, comments and anything else on the GitHub website.

Geospatial, Computer and Information Sciences

The rOpenSci geospatial suite

https://doi.org/10.59350/k0s8h-2bn12

Published November 22, 2016

Author Scott Chamberlain

Geospatial data - data embedded in a spatial context - is used across disciplines, whether it be history, biology, business, tech, public health, etc. Along with community contributors, we’re working on a suite of tools to make working with spatial data in R as easy as possible. If you’re not familiar with geospatial tools, it’s helpful to see what people do with them in the real world.

Http, Tech Notes, Computer and Information Sciences

fauxpas - HTTP conditions package

https://doi.org/10.59350/wn9k6-33630

Published November 18, 2016

Author Scott Chamberlain

HTTP, or Hypertext Transfer Protocol, is the protocol by which most of us interact with the web. When we make requests to a website in a browser on desktop or mobile, or get some data from a server in R, all of that is using HTTP.

Packages, Tesseract, Computer and Information Sciences

The new Tesseract package: High Quality OCR in R

https://doi.org/10.59350/r1f37-rc724

Published November 16, 2016

Author Jeroen Ooms

Optical character recognition (OCR) is the process of extracting written or typed text from images such as photos and scanned documents into machine-encoded text. The new rOpenSci package tesseract brings one of the best open-source OCR engines to R. This enables researchers or journalists, for example, to search and analyze vast numbers of documents that are only available in printed form.

Community, Meetings, Computer and Information Sciences

Chat with the rOpenSci team at upcoming meetings

https://doi.org/10.59350/cqjd9-hv894

Published November 9, 2016

Author Stefanie Butland

You can find members of the rOpenSci team at various meetings and workshops around the world. Come say ‘hi’, learn about how our packages can enable your research, or about our onboarding process for contributing new packages, discuss software sustainability or tell us how we can help you do open and reproducible research. Where’s rOpenSci?

API, Tech Notes, Computer and Information Sciences

crul - an HTTP client

https://doi.org/10.59350/hnby5-jr334

Published November 9, 2016

Author Scott Chamberlain

A new package crul is on CRAN. crul is another HTTP client for R, but is relatively simplified compared to httr, and is being built to link closely with webmockr and vcr. webmockr and vcr are packages ported from Ruby’s webmock and vcr, respectively. They both make mocking HTTP requests really easy. A major use case for mocking HTTP requests is for unit tests.

Climate, Tech Notes, Computer and Information Sciences

Parse NOAA Integrated Surface Data Files

https://doi.org/10.59350/typ3j-gyd24

Published November 3, 2016

Author Scott Chamberlain

A new package isdparser is on CRAN. isdparser was in part liberated from rnoaa, then improved. We’ll use isdparser in rnoaa soon. isdparser does not download files for you from NOAA’s ftp servers. The package focuses on parsing the files, which are variable length ASCII strings stored line by line, where each line has some mandatory data, and any amount of optional data.

Community, Community Call, Events, Governance, Computer and Information Sciences

Community Call v12 - How do I create a code of conduct for my event/lab/codebase?

https://doi.org/10.59350/pna7v-1dn22

Published October 31, 2016

Author Stefanie Butland

In order to facilitate a transformation towards open and reproducible research, rOpenSci is building and improving not only the technical infrastructure, but the social infrastructure as well. To support this, occasionally a Community Call will focus on a topic that reflects the values of rOpenSci.

Gpg, Crypto, Tech Notes, Computer and Information Sciences

Encryption and Digital Signatures in R using GPG

https://doi.org/10.59350/h84gg-mr646

Published October 19, 2016

Author Jeroen Ooms

A new package gpg has appeared on CRAN. From the package description: The package features a beautiful vignette to get you started with using GPG in R. Some highlights from the vignette below. Example: encryption Suppose we want to send an email to Glenn Greenwald containing top secret information. His homepage at The Intercept shows Greenwald’s GPG fingerprint.

rOpenSci - open tools for open science

Tesseract Update: Options and Languages
