Meet Paxata

What you can do with Paxata

Paxata provides a clean, familiar, spreadsheet-like feel. The challenge of prepping data is simplified to single clicks for each action. This provides a point-and-click experience that empowers you to quickly gather data, simply explore and prepare it, and then easily share it.


  • Quickly gather all your data into Paxata’s Library
  • Simply prepare your data in a Paxata Project with clicks, not code
  • Easily publish your work as a Paxata AnswerSet for reliable analytics


Video overview of Paxata

The video below provides a high-level overview of how you will use Paxata.

...

Element | Function
Account Menu | Access account-specific options like updating your password or logging out.
Help Toggle | Show or hide the Help Panel.
Help Panel | Get helpful information related to the current screen.
Navigation Menu | Navigate between the screens used to perform specific actions in Paxata. The primary screens are:
  • Library, where you access your imported and published data
  • Projects, where you prepare your data
  • Admin, where connections to data sources are made and user permissions are controlled
  • Automation, where details for automated datasets and Projects are provided
    Note: The screens available to each user are based on the user's permissions.
Notification Bell | Know when Paxata encounters a warning or error.

...

The Library is where you gather your data. Data that is imported into the Library, such as an Excel spreadsheet, is called a dataset. Once you have imported a dataset, you can begin prepping your data in a Project. See the Prep your data in a Project section of this article. When you have finished prepping your data, you can publish it back to the Library as an AnswerSet. See the Share your prepped data as an AnswerSet section of this article.

Sources of datasets


Datasets can be imported from local files on your computer or from connected data sources. Some examples of connected data sources are:

  • Cloud storage like Amazon S3
  • The Hadoop Distributed File System (HDFS)
  • Relational databases like MySQL
  • Secure File Transfer Protocol (SFTP)

...

Follow these steps to import a dataset from your computer:

Step | Action
1 | On the Library screen, click + import.
    Result: The Import Data screen appears.
2 | Click + Upload local file.
    Result: The Upload local file panel appears.
3 | To upload a file, do one of the following:
  • Click the Upload local file panel and select the file
  • Drag-and-drop the file into the Upload local file panel
    Result: The Parsing screen appears. The Parsing screen displays a preview of how Paxata structured your data.
4 | Check the preview. Does your data look correct?
    If yes, continue to the next step.
    If no, try adjusting the import options.
5 | Click Finish.
    Result: Your data is imported as a dataset and is ready to be prepared.
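
To see why the preview in step 4 matters, here is a minimal Python sketch of parsing a delimited file with two different option settings. It is only an illustration, not how Paxata parses data: the `preview` function, its parameters, and the sample text are all hypothetical.

```python
import csv
import io

def preview(text, delimiter=",", has_header=True, max_rows=3):
    """Parse delimited text and return (column_names, first_rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header:
        header, body = rows[0], rows[1:]
    else:
        # No header row: generate placeholder column names.
        header = [f"column_{i + 1}" for i in range(len(rows[0]))]
        body = rows
    return header, body[:max_rows]

# Hypothetical sample: a semicolon-delimited file.
sample = "name;city\nAda;London\nGrace;New York\n"

# With the wrong delimiter, every line lands in a single column.
wrong_header, wrong_rows = preview(sample, delimiter=",")

# Adjusting the option yields the expected two columns.
header, rows = preview(sample, delimiter=";")
```

If a preview shows all of your values crammed into one column, as `wrong_header` does here, a mismatched delimiter is one likely cause, and changing that import option is a reasonable first thing to try.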

...

Action | Description
Explore | Explore your data using dynamic visuals to highlight patterns, duplicates, blanks, errors and missing data.
Clean | Clean your data by standardizing values, removing duplicates, finding and fixing errors, and more.
Shape | Shape your data using tools like pivot, transpose, group by and more.
Combine | Combine additional datasets to enrich your data and provide more context.

...

Step | Action
1 | On the Projects screen, click + add.
    Result: The Start a New Project window appears.
2 | Enter a name for your Project in the Name field.
3 | To save your new Project and start prepping your data, click Save and Open.
    Result: The Start with a dataset screen appears. This is where you select the base dataset. A base dataset forms the foundation of your Project. It is the data on which all other actions in the Project will be performed.
    Are you able to locate the dataset you want to add?
    If yes, click SELECT next to the dataset.
    If no, click the Datasets + import button at the top of the screen to import the dataset you want to use.
    Result: The Project opens with the selected base dataset, which is now ready for preparation.

...

Operation | Description
FILTER | The FILTER operation combines the functionality of filters with the power of histograms. The result is called a Filtergram. With a Filtergram, you see the relative frequency of each value in a column and select values to temporarily hide some of your data. See the Data Filtergrams article.
CHANGE | The CHANGE operations allow you to standardize the values in a column. For example, you could change all numbers in a column to numeric values.
COLUMN | The COLUMN operations allow you to make changes to a column of data. You can do things like:
  • Split values into multiple columns based on a delimiter character or a given number of characters. See the Split Column article
  • Find and replace specified values in the column
  • Duplicate the column
WHITESPACE | The WHITESPACE operations allow you to remove leading and trailing spaces as well as extra spaces within your data.
OTHER | The OTHER operation is Cluster + Edit. This operation allows you to find values in a column that are similar and edit them so they are the same.
    Example: Before using Cluster + Edit, Apple Computer appears in a column as "Apple Computer", "Apple Corporation", and "Apple Computer Corporation". After using Cluster + Edit, all instances of Apple Computer can be standardized to "Apple Computer".
    See the Cluster and Edit article.
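
The idea behind Cluster + Edit can be sketched in a few lines of Python. This is not Paxata's algorithm: it assumes a simple string-similarity score from Python's standard `difflib` module, and the function names, the 0.5 threshold, and the extra "Microsoft" value are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

def cluster_values(values, threshold=0.5):
    """Group values whose similarity to a cluster's first member meets the threshold."""
    clusters = []  # each cluster is a list of distinct values
    for value in dict.fromkeys(values):  # distinct values, in original order
        for cluster in clusters:
            score = SequenceMatcher(None, cluster[0].lower(), value.lower()).ratio()
            if score >= threshold:
                cluster.append(value)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([value])
    return clusters

def standardize(values, clusters, choose=lambda cluster: cluster[0]):
    """Replace every value with the value chosen for its cluster."""
    replacement = {v: choose(c) for c in clusters for v in c}
    return [replacement[v] for v in values]

column = ["Apple Computer", "Apple Corporation", "Apple Computer Corporation", "Microsoft"]
clusters = cluster_values(column)
cleaned = standardize(column, clusters)
```

Here the three Apple variants fall into one cluster and are all rewritten to the first value seen, "Apple Computer", while "Microsoft" is left alone. In Paxata you choose the replacement value yourself; the `choose` argument stands in for that decision.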

...

Tool | Description
highlight | The more data you have, the harder it can be to notice small details. The highlight tools provide visual cues to help you see:
  • Patterns
  • Spaces
  • Ranges
attach | When you need to add additional datasets to your Project, use the attach tool. Rows of data can be added to the bottom of your Project. If your datasets have a matching column of data, the additional data can be combined with the data in the Project. See the Lookup article.
columns | Sometimes you may want to make minor adjustments to your columns. The columns tools let you:
  • Edit your column names
  • Rearrange your column order
  • Remove columns
compute | There may be a time when you need to write an expression. Maybe you want to concatenate data from multiple cells into one value, or perform a mathematical operation based on data. The compute tool is how you do that. See the Computed Columns article.
remove | Part of cleaning data is removing information that is not needed. The remove tool lets you remove rows of data. See the Remove Rows article.
sampling | You may find it useful to work with a sample of a dataset before bringing all the data into your Project. For large datasets, this can make initial exploration and discovery easier. The sampling tool also gives you the flexibility to filter down to a specific set of rows in your data, and then sample on the remainder.
shape | Change the shape of your data using the shape tools. With these tools, you can:
  • Deduplicate
  • Group data
  • Pivot
  • Depivot
  • Transpose
    See the Data Shaping Tools article.
auto # | The auto # tool assigns each row a number. This is helpful if you need to give each row a unique identifier.
new lens | Lenses create publishing points from Steps in your Project. When you publish from a Lens, the resulting AnswerSet is a snapshot at a particular Step in your Project. The AnswerSet is saved to the Library. Lenses are also essential for Project Automation because they define the publishing points to use for automated jobs. See the Project Lenses article.
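
The two ways the attach tool combines data, adding rows to the bottom and combining on a matching column, can be pictured with a small Python sketch. It is a conceptual illustration only and does not reflect how Paxata works internally. The function names and sample data are hypothetical, and the sketch assumes that Project rows without a match are kept unchanged.

```python
def append_rows(project_rows, new_rows):
    """Add the rows of another dataset to the bottom of the Project data."""
    return project_rows + new_rows

def lookup(project_rows, other_rows, key):
    """Add columns from other_rows to each Project row with a matching key value."""
    index = {row[key]: row for row in other_rows}  # last match wins on duplicate keys
    combined = []
    for row in project_rows:
        match = index.get(row[key], {})  # no match: nothing is added
        combined.append({**match, **row})  # Project values win on name clashes
    return combined

# Hypothetical sample data.
orders = [{"customer_id": 1, "total": 40}, {"customer_id": 2, "total": 15}]
more_orders = [{"customer_id": 3, "total": 22}]
customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 3, "name": "Grace"}]

all_orders = append_rows(orders, more_orders)
enriched = lookup(all_orders, customers, key="customer_id")
```

After these two calls, `all_orders` holds three rows, and `enriched` adds a `name` column to the rows for customers 1 and 3. The row for customer 2 has no match, so it keeps only its original columns.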

...

When you're ready to save and share the data you prepped, publish it to the Library as an AnswerSet. An AnswerSet is like a dataset. The difference is that an AnswerSet is the published result of your data prep. Once published, you can reuse the AnswerSet in other Projects or export the AnswerSet to share with other applications. An AnswerSet is always published to the Library. AnswerSets can also be created at any time and for any set of specific Steps in your Project using a Lens.

...

Follow these steps to publish an AnswerSet:

Step | Action
1 | Click steps in the TOOLS menu.
    Result: The Steps Editor panel opens.
2 | Click the step you want to publish an AnswerSet from.
    Note: Paxata defaults to the last step in the Project.
3 | At the top of the Steps Editor, click Publish.
    Result: The Publish AnswerSet to Library window appears.
4 | Enter a name for the AnswerSet in the Name field.
5 | Click Publish.
    Result: The Publishing AnswerSet message appears. Paxata publishes an AnswerSet using the steps up to and including the selected step. The AnswerSet is published to the Library.
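
The rule in step 5, that an AnswerSet reflects the steps up to and including the selected step, can be modeled in Python as follows. This is a conceptual sketch only, not Paxata's implementation: it treats a Project as an ordered list of functions, and every name and value in it is hypothetical.

```python
def publish(base_dataset, steps, selected=None):
    """Return a snapshot of the data after steps[0] through steps[selected], inclusive."""
    if selected is None:
        selected = len(steps) - 1  # default to the last step
    rows = [dict(row) for row in base_dataset]  # copy so the base dataset is untouched
    for step in steps[: selected + 1]:
        rows = step(rows)
    return rows

# Hypothetical preparation steps.
def trim_names(rows):
    return [{**r, "name": r["name"].strip()} for r in rows]

def drop_blank_names(rows):
    return [r for r in rows if r["name"]]

def number_rows(rows):
    return [{"row_id": i, **r} for i, r in enumerate(rows, start=1)]

base = [{"name": " Ada "}, {"name": "  "}, {"name": "Grace"}]
steps = [trim_names, drop_blank_names, number_rows]

after_first_step = publish(base, steps, selected=0)  # snapshot at the first step
final = publish(base, steps)                         # snapshot at the last step
```

Publishing at the first step still includes the blank row, because the step that removes it has not run yet. Publishing at the last step returns two numbered rows. The base dataset itself is never changed.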

...

Export your prepped data

Export overview

Datasets and AnswerSets can be exported out of Paxata. Exporting lets you use your data in other applications.

...

Follow these steps to download an AnswerSet or dataset locally:

Step | Action
1 | On the Library screen, hover your mouse over the AnswerSet or dataset you want to export.
2 | Click the Export button that displays.
    Result: The Exporting screen appears.
3 | Click Download file locally.
    Result: The Export Settings screen appears.
4 | Click Export.
    Result: The AnswerSet or dataset is downloaded to your computer as a comma-separated values (CSV) file. The Export Log screen appears.
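
Because the download is a comma-separated values file, most tools can read it. As one example, the Python sketch below loads such a file with the standard `csv` module. It assumes the file has a header row and is UTF-8 encoded, which may not match your export settings. The function name, file name, and sample contents are hypothetical.

```python
import csv
import io

def load_exported_csv(source):
    """Read CSV text from an open file object into a list of row dictionaries."""
    return list(csv.DictReader(source))

# Stand-in for a downloaded file. With a real export you might instead use
# open("my_answerset.csv", newline="", encoding="utf-8"); the file name is hypothetical.
exported = io.StringIO("name,city\nAda,London\nGrace,New York\n")
rows = load_exported_csv(exported)
```

Each row comes back as a dictionary keyed by column name, so `rows[0]["city"]` is the city value from the first data row.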

...

Term | Definition
AnswerSet | Like a dataset, except that it is the published result of your data prep
Base dataset | The data on which all other actions in the Project will be performed
Data source | The source of your dataset
Dataset | Data that is imported into the Library
Filtergram | The combination of the functionality of filters with the power of histograms

...