Helpful Functions

The following documentation describes helpful utility functions included with this package.

Once you're familiar with them, get involved!

  1. Make a discussion post introducing yourself and sharing how you're using Epistemic Network Analysis
  2. File an issue anytime you encounter a bug or are unable to make the package do what you need

statistics

Main.EpistemicNetworkAnalysis.statistics - Function
statistics(model::AbstractENAModel)

Produce a DataFrame containing statistics for each dimension of the model's embedding

Example

model = ENAModel(data, codes, conversations, units)
stats = statistics(model)

show

Base.show - Method
show(model::AbstractENAModel)

Display text summarizing a model's configuration and summary statistics.

Example

model = ENAModel(data, codes, conversations, units)
show(model)

loadExample

Main.EpistemicNetworkAnalysis.loadExample - Function
loadExample(name::AbstractString)

Load an example dataset as a DataFrame

Datasets

  • loadExample("shakespeare"): Loads the Shakespeare dataset, containing data on two plays, "Hamlet" and "Romeo and Juliet"
  • loadExample("transition"): Loads the Telling Stories of Transitions dataset, containing metadata and codes only, due to the sensitive nature of the underlying text
  • loadExample("toy"): Loads a minimal toy example, reproduced below
Group,Convo,Unit,Line,A,B,C
Red,1,X,1,0,0,1
Red,1,Y,2,1,0,0
Blue,1,Z,3,0,1,1
Blue,1,W,4,0,0,0
Red,1,X,5,0,0,1
Red,2,X,1,1,0,0
Red,2,Y,2,1,0,0
Blue,2,Z,3,0,1,1
Blue,2,W,4,0,0,0
Red,2,X,5,1,0,0

Loading Your Own Data

To load your own datasets, use DataFrame and CSV.File, which require the DataFrames and CSV packages:

using Pkg
Pkg.add("DataFrames")
using DataFrames

Pkg.add("CSV")
using CSV

data = DataFrame(CSV.File("filename_here.csv"))

deriveAnyCode!

Main.EpistemicNetworkAnalysis.deriveAnyCode! - Function
deriveAnyCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)

Add a new code column to data, derived from existing codes. The new code will be marked present where any of the old codes are present

Example

deriveAnyCode!(data, :Food, :Hamburgers, :Salads, :Cereal)

deriveAllCode!

Main.EpistemicNetworkAnalysis.deriveAllCode! - Function
deriveAllCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)

Add a new code column to data, derived from existing codes. The new code will be marked present only where all of the old codes are present on the same line

Example

deriveAllCode!(data, :ObservingStudentsLearning, :Observing, :Students, :Learning)
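
The difference between the "any" and "all" derivations can be sketched with plain vectors, using the A, B, and C columns of the toy dataset shown earlier. The helpers derive_any and derive_all below are hypothetical stand-ins written only for illustration. They are not part of the package and may not match how deriveAnyCode! and deriveAllCode! are implemented internally. The sketch also assumes that a code counts as present when its value is greater than zero.

```julia
# Code columns A, B, and C from the toy dataset, one entry per line of data
A = [0, 1, 0, 0, 0, 1, 1, 0, 0, 1]
B = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
C = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical helpers (not package functions): a code is treated as
# present on a line when its value is greater than zero (an assumption)
derive_any(cols...) = [Int(any(c[i] > 0 for c in cols)) for i in eachindex(first(cols))]
derive_all(cols...) = [Int(all(c[i] > 0 for c in cols)) for i in eachindex(first(cols))]

any_abc = derive_any(A, B, C)  # [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
all_bc  = derive_all(B, C)     # [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
all_abc = derive_all(A, B, C)  # all zeros: no line has A, B, and C together
```

In the toy data, an "any" code is present on every line that has at least one code, while an "all" code over A, B, and C is never present, because those three codes never occur on the same line.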

to_xlsx

Main.EpistemicNetworkAnalysis.to_xlsx - Function
to_xlsx(filename::AbstractString, model::AbstractENAModel)

Save a model to the disk as an Excel spreadsheet, useful for sharing results with others

See also serialize for saving models in a more efficient format that can be reloaded into Julia using deserialize

Note: a from_xlsx function does not exist, but is planned. The difficulty is that Excel data is at root a human-readable string format, and some components of some models are difficult to represent reliably as human-readable strings
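
As a rough illustration of the serialize and deserialize round trip mentioned above, the sketch below uses Julia's Serialization standard library. The results dictionary is a hypothetical stand-in for a model, so that the example runs without any data. With a real model, the same two calls would presumably apply to the model object itself.

```julia
using Serialization

# Hypothetical stand-in for a fitted model (an assumption, for illustration only)
results = Dict("dims" => 2, "codes" => ["A", "B", "C"])

# Write the object to disk in Julia's binary serialization format
path = tempname()
serialize(path, results)

# Reload it later in a Julia session
restored = deserialize(path)

# Remove the temporary file once it is no longer needed
rm(path)
```

Serialized files are meant to be read back by Julia, and the format is not guaranteed to be readable across different Julia versions, so the Excel export remains the better choice for sharing results with others.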


pointcloud

Main.EpistemicNetworkAnalysis.pointcloud - Function
pointcloud(
    model::AbstractENAModel;
    ndims::Int=nrow(model.points),
    mode::Symbol=:wide,
    z_norm::Bool=false,
    metadata::Vector{Symbol}=Symbol[]
)

Produce a point cloud matrix from a model's plotted points and optional additional metadata columns, for preparing data to pass to other packages, e.g., for machine learning.

Arguments

Required:

  • model: The ENA model to produce a point cloud from

Optional:

  • ndims: The number of dimensions from the ENA model's embedding to include in the point cloud. The first ndims dimensions will be included. By default, all dimensions will be included
  • mode: The orientation of the point cloud, either :wide (default) or :tall. In wide format, the rows of the point cloud's X matrix correspond to features and the columns to units. In tall format, the rows correspond to units and the columns to features.
  • z_norm: Whether to normalize the point cloud's features (default: false)
  • metadata: A list of names of additional metadata columns from the model to include in the point cloud. Note: when including additional metadata, it is advisable to also set z_norm to true

Fields

Once the point cloud is constructed, it will have the following fields:

  • X: A matrix containing the point cloud data, in either wide or tall format
  • feature_names: A vector of the names of the features included in the point cloud. When mode is :wide, feature_names corresponds to the rows of X. When mode is :tall, it corresponds to the columns of X instead.
  • unit_names: A vector of the IDs of the units included in the point cloud. When mode is :wide, unit_names corresponds to the columns of X. When mode is :tall, it corresponds to the rows of X instead.
  • z_normed: A boolean representing whether the point cloud was normalized
  • z_means and z_stds: When z_normed is true, these are vectors of the original means and standard deviations of the features of the point cloud
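
To make the roles of z_means and z_stds concrete, here is a minimal sketch of z-normalization on a small tall-format matrix, using only Julia's Statistics standard library. The matrix X and its values are hypothetical. The sketch also assumes the sample standard deviation (Julia's default for std), which may differ from what pointcloud computes internally.

```julia
using Statistics

# Hypothetical tall-format matrix: 4 units (rows) by 2 features (columns)
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0;
     4.0 40.0]

# Per-feature means and standard deviations, stored as vectors
z_means = vec(mean(X, dims=1))
z_stds = vec(std(X, dims=1))

# Z-normalize: each feature ends up with mean 0 and standard deviation 1
Z = (X .- z_means') ./ z_stds'

# The stored means and standard deviations allow the original values to be recovered
X_restored = Z .* z_stds' .+ z_means'

# The wide orientation of the same data is the transpose: features in rows, units in columns
Z_wide = permutedims(Z)
```

Normalizing puts every feature on a comparable scale, which is presumably why it is advised when metadata columns with different units are mixed in with the embedding dimensions.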

Example

# Wide format DataFrame
pc = pointcloud(model)
df = DataFrame(pc.X, pc.unit_names)

# Tall format DataFrame
pc = pointcloud(model, mode=:tall)
df = DataFrame(pc.X, pc.feature_names)

# ndims, metadata, and z_norm
pc = pointcloud(model, ndims=4, mode=:tall, metadata=[:Act], z_norm=true)
df = DataFrame(pc.X, pc.feature_names)