Helpful Functions
The following documentation describes helpful utility functions included with this package.
Once you're familiar with them, get involved!
- Make a discussion post introducing yourself and sharing how you're using Epistemic Network Analysis
- File an issue anytime you encounter a bug or are unable to make the package do what you need
statistics
Main.EpistemicNetworkAnalysis.statistics — Function
statistics(model::AbstractENAModel)
Produce a DataFrame containing statistics for each dimension of the model embedding
Example
model = ENAModel(data, codes, conversations, units)
stats = statistics(model)
show
Base.show — Method
show(model::AbstractENAModel)
Display text summarizing a model's configuration and summary statistics.
Example
model = ENAModel(data, codes, conversations, units)
show(model)
loadExample
Main.EpistemicNetworkAnalysis.loadExample — Function
loadExample(name::AbstractString)
Load an example dataset as a DataFrame
Datasets
- loadExample("shakespeare"): Loads the Shakespeare dataset, containing data on two plays, "Hamlet" and "Romeo and Juliet"
- loadExample("transition"): Loads the Telling Stories of Transitions dataset, containing metadata and codes only, due to the sensitive nature of the underlying text
- loadExample("toy"): Loads a minimal toy example, reproduced below
Group,Convo,Unit,Line,A,B,C
Red,1,X,1,0,0,1
Red,1,Y,2,1,0,0
Blue,1,Z,3,0,1,1
Blue,1,W,4,0,0,0
Red,1,X,5,0,0,1
Red,2,X,1,1,0,0
Red,2,Y,2,1,0,0
Blue,2,Z,3,0,1,1
Blue,2,W,4,0,0,0
Red,2,X,5,1,0,0
Loading Your Own Data
To load your own datasets, use DataFrame and CSV.File, which require the DataFrames and CSV packages:
using Pkg
Pkg.add("DataFrames")
using DataFrames
Pkg.add("CSV")
using CSV
data = DataFrame(CSV.File("filename_here.csv"))
deriveAnyCode!
Main.EpistemicNetworkAnalysis.deriveAnyCode! — Function
deriveAnyCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
Add a new code column to data, derived from existing codes. The new code will be marked present where any of the old codes are present.
Example
deriveAnyCode!(data, :Food, :Hamburgers, :Salads, :Cereal)
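To make the "any" rule concrete, the following sketch applies the same logic to plain vectors without calling the package. The values are hypothetical, and it assumes codes are stored as 0/1 integers, as in the toy dataset above. It illustrates the rule only and is not the package's implementation.

```julia
# Hypothetical 0/1 code columns for four lines of data (not from a real dataset)
hamburgers = [1, 0, 0, 0]
salads     = [0, 0, 1, 0]
cereal     = [0, 0, 1, 0]

# "Any" rule: the derived code is 1 on a line when at least one source code is 1
food = Int.((hamburgers .== 1) .| (salads .== 1) .| (cereal .== 1))

println(food)  # [1, 0, 1, 0]
```

The derived code is present on lines 1 and 3, because those are the only lines where at least one source code is present.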
deriveAllCode!
Main.EpistemicNetworkAnalysis.deriveAllCode! — Function
deriveAllCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
Add a new code column to data, derived from existing codes. The new code will be marked present only where all of the old codes are present on the same line.
Example
deriveAllCode!(data, :ObservingStudentsLearning, :Observing, :Students, :Learning)
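The "all" rule can be sketched the same way on plain vectors. The values below are hypothetical, and the sketch assumes codes are stored as 0/1 integers. It shows the rule itself, not the package's implementation.

```julia
# Hypothetical 0/1 code columns for four lines of data (not from a real dataset)
observing = [1, 1, 0, 1]
students  = [1, 0, 0, 1]
learning  = [1, 1, 0, 0]

# "All" rule: the derived code is 1 only on lines where every source code is 1
combined = Int.((observing .== 1) .& (students .== 1) .& (learning .== 1))

println(combined)  # [1, 0, 0, 0]
```

Only line 1 has all three codes present, so it is the only line where the derived code is marked.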
to_xlsx
Main.EpistemicNetworkAnalysis.to_xlsx — Function
to_xlsx(filename::AbstractString, model::AbstractENAModel)
Save a model to the disk as an Excel spreadsheet, useful for sharing results with others
See also serialize for saving models in a more efficient format that can be reloaded into Julia using deserialize
Note: a from_xlsx function does not exist, but is planned. The difficulty is that Excel data is at root a human-readable string format, and some components of some models are difficult to represent reliably as human-readable strings
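As a rough illustration of the serialize and deserialize round trip mentioned above, the sketch below uses Julia's Serialization standard library, assuming those are the functions meant. The named tuple and the file name are hypothetical stand-ins for a fitted model and its output path.

```julia
using Serialization

# Hypothetical stand-in for a fitted model; any Julia value is handled the same way
results = (name = "toy", codes = [:A, :B, :C], weights = [0.25, 0.5, 0.25])

# Hypothetical file name, placed in a temporary directory for this sketch
path = joinpath(mktempdir(), "results.ser")

serialize(path, results)      # write a binary representation to disk
restored = deserialize(path)  # read it back into a Julia value

println(restored == results)  # true
```

Julia's serialization format is not guaranteed to stay readable across Julia versions, so a spreadsheet export is likely the better fit for long-term sharing.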
pointcloud
Main.EpistemicNetworkAnalysis.pointcloud — Function
pointcloud(
    model::AbstractENAModel;
    ndims::Int=nrow(model.points),
    mode::Symbol=:wide,
    z_norm::Bool=false,
    metadata::Vector{Symbol}=Symbol[]
)
Produce a point cloud matrix from a model's plotted points and optional additional metadata columns, for preparing data to pass to other packages, e.g., for machine learning.
Arguments
Required:
- model: The ENA model to produce a point cloud from
Optional:
- ndims: The number of dimensions from the ENA model's embedding to include in the point cloud. The first ndims dimensions will be included. By default, all dimensions will be included
- mode: The orientation of the point cloud, either :wide format (default) or :tall format. In wide format, the rows of the point cloud's X matrix correspond to features. In tall format, they correspond to units
- z_norm: Whether to normalize the point cloud's features (default: false)
- metadata: A list of names of additional metadata columns from the model to include in the point cloud. Note: when including additional metadata, it is advised to also set z_norm to true
Fields
Once the point cloud is constructed, it will have the following fields:
- X: A matrix containing the point cloud data, in either wide or tall format
- feature_names: A vector of the names of the features included in the point cloud. When mode is :wide, feature_names corresponds to the rows of X. When mode is :tall, it corresponds to the columns of X instead.
- unit_names: A vector of the IDs of the units included in the point cloud. When mode is :wide, unit_names corresponds to the columns of X. When mode is :tall, it corresponds to the rows of X instead.
- z_normed: A boolean representing whether the point cloud was normalized
- z_means and z_stds: When z_normed is true, these are vectors of the original means and standard deviations of the features of the point cloud
Example
# Wide format DataFrame
pc = pointcloud(model)
df = DataFrame(pc.X, pc.unit_names)
# Tall format DataFrame
pc = pointcloud(model, mode=:tall)
df = DataFrame(pc.X, pc.feature_names)
# ndims, metadata, and z_norm
pc = pointcloud(model, ndims=4, mode=:tall, metadata=[:Act], z_norm=true)
df = DataFrame(pc.X, pc.feature_names)
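The sketch below shows how z-normalization relates to stored means and standard deviations. It uses only Julia's Statistics standard library and does not call the package. The matrix is hypothetical, the variable names merely mirror the z_means and z_stds fields, and the use of the sample standard deviation (Julia's default) is an assumption, since the documentation above does not say which estimator the package uses.

```julia
using Statistics

# Hypothetical tall-format matrix: rows are units, columns are features
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]

# Per-feature (column) means and sample standard deviations, kept as vectors
z_means = vec(mean(X, dims=1))
z_stds  = vec(std(X, dims=1))

# Z-normalize: subtract each column's mean, then divide by its standard deviation
Z = (X .- z_means') ./ z_stds'

# Undo the normalization using the stored means and standard deviations
X_restored = Z .* z_stds' .+ z_means'

println(Z[:, 1])         # [-1.0, 0.0, 1.0]
println(X_restored ≈ X)  # true
```

After normalization both features are on the same scale even though the second was originally ten times larger, which is one reason normalization is commonly advised when metadata columns with different units are mixed in.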