# Helpful Functions

The following documentation describes helpful utility functions included with this package.

Once you're familiar with them, get involved!

- Make a discussion post introducing yourself and sharing how you're using Epistemic Network Analysis
- File an issue anytime you encounter a bug or are unable to make the package do what you need

## statistics

`Main.EpistemicNetworkAnalysis.statistics`

```
statistics(model::AbstractENAModel)
```

Produce a dataframe containing statistics for each dimension of the model's embedding.

**Example**

```
model = ENAModel(data, codes, conversations, units)
stats = statistics(model)
```

## show

`Base.show`

```
show(model::AbstractENAModel)
```

Display text summarizing a model's configuration and summary statistics.

**Example**

```
model = ENAModel(data, codes, conversations, units)
show(model)
```

## loadExample

`Main.EpistemicNetworkAnalysis.loadExample`

```
loadExample(name::AbstractString)
```

Load an example dataset as a DataFrame.

**Datasets**

`loadExample("shakespeare")`

: Loads the Shakespeare dataset, containing data on two plays, "Hamlet" and "Romeo and Juliet"

`loadExample("transition")`

: Loads the Telling Stories of Transitions dataset, containing metadata and codes only, due to the sensitive nature of the underlying text

`loadExample("toy")`

: Loads a minimal toy example, reproduced below

```
Group,Convo,Unit,Line,A,B,C
Red,1,X,1,0,0,1
Red,1,Y,2,1,0,0
Blue,1,Z,3,0,1,1
Blue,1,W,4,0,0,0
Red,1,X,5,0,0,1
Red,2,X,1,1,0,0
Red,2,Y,2,1,0,0
Blue,2,Z,3,0,1,1
Blue,2,W,4,0,0,0
Red,2,X,5,1,0,0
```

**Loading Your Own Data**

To load your own datasets, use `DataFrame` and `CSV.File`, which requires the `DataFrames` and `CSV` packages:

```
using Pkg
Pkg.add(["DataFrames", "CSV"])
using DataFrames, CSV
data = DataFrame(CSV.File("filename_here.csv"))
```

## deriveAnyCode!

`Main.EpistemicNetworkAnalysis.deriveAnyCode!`

```
deriveAnyCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
```

Add a new code column to `data`, derived from existing codes. The new code will be marked present where any of the old codes are present.

**Example**

`deriveAnyCode!(data, :Food, :Hamburgers, :Salads, :Cereal)`
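The "any" rule itself can be sketched on plain 0/1 vectors. This is an illustration of the semantics only, not the package's implementation; the code names are borrowed from the example above:

```julia
# Illustration only: the "any" derivation rule on 0/1 code vectors,
# one entry per line of data
hamburgers = [1, 0, 0, 1]
salads     = [0, 1, 0, 0]
cereal     = [0, 0, 0, 0]

# Food is present wherever at least one of the old codes is present
food = [maximum(row) for row in zip(hamburgers, salads, cereal)]
# food == [1, 1, 0, 1]
```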

## deriveAllCode!

`Main.EpistemicNetworkAnalysis.deriveAllCode!`

```
deriveAllCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
```

Add a new code column to `data`, derived from existing codes. The new code will be marked present only where all of the old codes are present on the same line.

**Example**

`deriveAllCode!(data, :ObservingStudentsLearning, :Observing, :Students, :Learning)`
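Analogously to `deriveAnyCode!`, the "all" rule can be sketched on plain 0/1 vectors (an illustration of the semantics, not the package's implementation):

```julia
# Illustration only: the "all" derivation rule on 0/1 code vectors,
# one entry per line of data
observing = [1, 1, 0, 1]
students  = [1, 1, 0, 0]
learning  = [1, 0, 1, 0]

# The derived code is present only where every old code is present on that line
observingStudentsLearning = [minimum(row) for row in zip(observing, students, learning)]
# observingStudentsLearning == [1, 0, 0, 0]
```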

## to_xlsx

`Main.EpistemicNetworkAnalysis.to_xlsx`

```
to_xlsx(filename::AbstractString, model::AbstractENAModel)
```

Save a model to disk as an Excel spreadsheet, useful for sharing results with others.

See also `serialize` for saving models in a more efficient format that can be reloaded into Julia using `deserialize`.

Note: a `from_xlsx` function does not exist, but is planned. The difficulty is that Excel data is at root a human-readable string format, and some components of some models are difficult to represent reliably as human-readable strings.
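The `serialize`/`deserialize` workflow mentioned above uses Julia's standard-library Serialization module. A minimal sketch on a stand-in value (a fitted model would be passed the same way; the `Dict` here is only a placeholder):

```julia
using Serialization

# Stand-in value; `serialize` accepts arbitrary Julia objects, including models
model_data = Dict(:codes => [:A, :B, :C], :points => [0.1 0.2; 0.3 0.4])

serialize("model.ena", model_data)   # compact binary format, Julia-only
reloaded = deserialize("model.ena")  # reload in a compatible Julia session
```

Unlike `to_xlsx`, this format is not human-readable, but it round-trips models losslessly between Julia sessions.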

## pointcloud

`Main.EpistemicNetworkAnalysis.pointcloud`

```
pointcloud(
    model::AbstractENAModel;
    ndims::Int=nrow(model.points),
    mode::Symbol=:wide,
    z_norm::Bool=false,
    metadata::Vector{Symbol}=Symbol[]
)
```

Produce a point cloud matrix from a model's plotted points and optional additional metadata columns, for preparing data to pass to other packages, e.g., for machine learning.

**Arguments**

Required:

`model`

: The ENA model to produce a point cloud from

Optional:

`ndims`

: The number of dimensions from the ENA model's embedding to include in the point cloud. The first `ndims` dimensions will be included. By default, all dimensions are included.

`mode`

: The orientation of the point cloud, either `:wide` format (default) or `:tall` format. In wide format, the rows of the point cloud's `X` matrix correspond to features. In tall format, they correspond to units.

`z_norm`

: Whether to z-normalize the point cloud's features (default: `false`)

`metadata`

: A list of additional metadata column names from the model to include in the point cloud. Note: when including additional metadata, it is advised to also set `z_norm` to `true`.
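z-normalization is the standard transform: each feature is centered by its mean and scaled by its standard deviation. A minimal sketch using the standard-library Statistics module (an illustration of the transform, not the package's internal code):

```julia
using Statistics

# One feature of a point cloud, as a plain vector
feature = [2.0, 4.0, 6.0]

# Center by the mean and scale by the (sample) standard deviation
μ, σ = mean(feature), std(feature)
z = (feature .- μ) ./ σ
# z == [-1.0, 0.0, 1.0]; the normalized feature has mean 0 and std 1
```

This is why mixing raw metadata columns into the point cloud without `z_norm=true` can let one large-scale column dominate the others.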

**Fields**

Once the point cloud is constructed, it will have the following fields:

`X`

: A matrix containing the point cloud data, in either wide or tall format

`feature_names`

: A vector of the names of the features included in the point cloud. When `mode` is `:wide`, `feature_names` corresponds to the rows of `X`. When `mode` is `:tall`, it corresponds to the columns of `X` instead.

`unit_names`

: A vector of the IDs of the units included in the point cloud. When `mode` is `:wide`, `unit_names` corresponds to the columns of `X`. When `mode` is `:tall`, it corresponds to the rows of `X` instead.

`z_normed`

: A boolean indicating whether the point cloud was z-normalized

`z_means` and `z_stds`

: When `z_normed` is `true`, these are vectors of the original means and standard deviations of the features of the point cloud
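The relationship between the two orientations can be sketched with plain arrays. The names and values below are hypothetical, for illustrating the layout only:

```julia
# Hypothetical 2-dimension, 3-unit embedding
feature_names = ["Dim1", "Dim2"]
unit_names    = ["X", "Y", "Z"]

X_wide = [0.1 0.2 0.3;       # rows correspond to feature_names in :wide mode
          0.4 0.5 0.6]       # columns correspond to unit_names

X_tall = permutedims(X_wide) # rows correspond to unit_names in :tall mode
# size(X_wide) == (2, 3) and size(X_tall) == (3, 2)
```

In other words, `:tall` is simply the transpose of `:wide`, which is the orientation most machine-learning packages expect (one row per observation).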

**Example**

```
# Wide format DataFrame
pc = pointcloud(model)
df = DataFrame(pc.X, pc.unit_names)

# Tall format DataFrame
pc = pointcloud(model, mode=:tall)
df = DataFrame(pc.X, pc.feature_names)

# ndims, metadata, and z_norm
pc = pointcloud(model, ndims=4, mode=:tall, metadata=[:Act], z_norm=true)
df = DataFrame(pc.X, pc.feature_names)
```