R packages by gesistsa

rio - A Swiss-Army Knife for Data I/O

Streamlined data import and export by making assumptions that the user is probably willing to make: 'import()' and 'export()' determine the data format from the file extension, reasonable defaults are used for data import and export, web-based import is natively supported (including from SSL/HTTPS), compressed files can be read directly, and fast import packages are used where appropriate. An additional convenience function, 'convert()', provides a simple method for converting between file types.

Last updated 4 months ago

csv csvy data data-science excel io rio sas spss stata

17.10 score 610 stars 74 dependents 7.8k scripts 47k downloads
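
The extension-based dispatch described for 'import()', 'export()' and 'convert()' can be illustrated with a small sketch. This is not rio's implementation or API: it is a minimal Python analogue using only the standard library, and the names `FORMATS`, `import_data`, `export_data` and the Python `convert` are hypothetical. It assumes tabular data is held as a list of dictionaries and covers only CSV, TSV and JSON.

```python
import csv
import json
from pathlib import Path


def _read_delimited(path, delimiter):
    # Read a delimited text file into a list of dicts (all values are strings).
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter=delimiter))


def _write_delimited(rows, path, delimiter):
    # Column names are taken from the first row (an assumption of this sketch).
    fieldnames = list(rows[0].keys()) if rows else []
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)


def _read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def _write_json(rows, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f)


# Hypothetical registry mapping a file extension to a (reader, writer) pair.
FORMATS = {
    ".csv": (lambda p: _read_delimited(p, ","),
             lambda r, p: _write_delimited(r, p, ",")),
    ".tsv": (lambda p: _read_delimited(p, "\t"),
             lambda r, p: _write_delimited(r, p, "\t")),
    ".json": (_read_json, _write_json),
}


def _lookup(path):
    # The format is determined from the file extension alone.
    ext = Path(path).suffix.lower()
    if ext not in FORMATS:
        raise ValueError(f"Unrecognized file format: {ext!r}")
    return FORMATS[ext]


def import_data(path):
    reader, _ = _lookup(path)
    return reader(path)


def export_data(rows, path):
    _, writer = _lookup(path)
    writer(rows, path)
    return path


def convert(in_path, out_path):
    # Converting between file types is an import followed by an export.
    return export_data(import_data(in_path), out_path)
```

With this layout, supporting another format only means adding one entry to the registry; the three public functions stay unchanged.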

rtoot - Collecting and Analyzing Mastodon Data

An implementation of calls designed to collect and organize Mastodon data via its Application Programming Interface (API), which is documented at <https://docs.joinmastodon.org/>.

Last updated 2 months ago

mastodon mastodon-api

8.70 score 105 stars 67 scripts 1.1k downloads

oolong - Create Validation Tests for Automated Content Analysis

Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.

Last updated 1 month ago

textanalysis topicmodeling validation

7.58 score 55 stars 23 scripts 237 downloads

minty - Minimal Type Guesser

A port of the type guesser from 'readr' (the so-called first edition parsing engine of 'readr', now superseded by 'vroom').

Last updated 3 months ago

cpp

7.16 score 5 stars 26 dependents 5 scripts 9.3k downloads
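
To give a rough idea of what a type guesser does, the sketch below infers a column type from a vector of strings by trying the most specific type first and falling back to character. It is a simplified Python illustration, not the algorithm used by 'readr' or minty: the function name `guess_type`, the set of accepted logical spellings, the missing-value markers, and the choice to report an all-missing column as character are all assumptions of this sketch.

```python
import re

# Patterns assumed for this sketch; real parsers accept more spellings.
_LOGICAL = {"TRUE", "FALSE", "T", "F", "true", "false"}
_INTEGER = re.compile(r"[+-]?\d+")
_DOUBLE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def guess_type(values, na=("", "NA")):
    """Guess a column type from strings, ignoring missing-value markers."""
    present = [v for v in values if v not in na]
    if not present:
        # Nothing to inspect; this sketch falls back to character.
        return "character"
    # Candidate types are tried from most to least specific.
    checks = [
        ("logical", lambda v: v in _LOGICAL),
        ("integer", lambda v: _INTEGER.fullmatch(v) is not None),
        ("double", lambda v: _DOUBLE.fullmatch(v) is not None),
    ]
    for name, matches in checks:
        if all(matches(v) for v in present):
            return name
    return "character"
```

Because every integer string also matches the double pattern, the order of the checks matters: a column is reported as double only when at least one value fails the integer test.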

adaR - A Fast 'WHATWG' Compliant URL Parser

A wrapper for 'ada-url', a 'WHATWG' compliant and fast URL parser written in modern 'C++'. Also contains auxiliary functions such as a public suffix extractor.

Last updated 1 month ago

url-parser cpp

6.95 score 27 stars 2 dependents 11 scripts 409 downloads

rang - Reconstructing Reproducible R Computational Environments

Resolve the dependency graph of R packages at a specific time point based on the information from various 'R-hub' web services <https://blog.r-hub.io/>. The dependency graph can then be used to reconstruct the R computational environment with 'Rocker' <https://rocker-project.org>.

Last updated 2 months ago

reproducibility reproducible-research

6.32 score 80 stars 13 scripts 258 downloads

webtrackR - Preprocessing and Analyzing Web Tracking Data

Data structures and methods to work with web tracking data. The functions cover data preprocessing steps, enriching web tracking data with external information and methods for the analysis of digital behavior as used in several academic papers (e.g., Clemm von Hohenberg et al., 2023 <doi:10.17605/OSF.IO/M3U9P>; Stier et al., 2022 <doi:10.1017/S0003055421001222>).

Last updated 4 months ago

webtracking

6.03 score 9 stars 8 scripts 578 downloads

grafzahl - Supervised Machine Learning for Textual Data Using Transformers and 'Quanteda'

Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to set up the 'Python' environment to use the pretrained models from 'Hugging Face' <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.

Last updated 1 month ago

5.91 score 41 stars 3 scripts 201 downloads

sweater - Speedy Word Embedding Association Test and Extras Using R

Conduct various tests for evaluating implicit biases in word embeddings: Word Embedding Association Test (Caliskan et al., 2017) <doi:10.1126/science.aal4230>, Relative Norm Distance (Garg et al., 2018) <doi:10.1073/pnas.1720347115>, Mean Average Cosine Similarity (Manzini et al., 2019) <arXiv:1904.04047>, SemAxis (An et al., 2018) <arXiv:1806.05521>, Relative Negative Sentiment Bias (Sweeney & Najafian, 2019) <doi:10.18653/v1/P19-1162>, and Embedding Coherence Test (Dev & Phillips, 2019) <arXiv:1901.07656>.

Last updated 2 months ago

bias-detection textanalysis wordembedding cpp

4.80 score 30 stars 14 scripts 501 downloads

webbotparseR - Parse HTML Files Containing Search Engine Results

Parse search engine results which have been scraped with the 'WebBot' browser extension <https://github.com/gesiscss/WebBot>.

Last updated 4 months ago

browser-extension search-engine

3.20 score 8 stars 6 scripts