What you can do with Paxata
Video overview of Paxata
Tour the basics of Paxata
Sources of datasets
Import a local file
Tour the Library screen
Video overview of Projects
Start a new Project
Tour the Projects screen
Overview of column operations
Overview of TOOLS
Publish an AnswerSet
Export your AnswerSet to your computer
What you can do with Paxata
Paxata provides a clean, familiar, spreadsheet-like feel. The challenge of prepping data is simplified to single clicks for each action. This provides a point-and-click experience that empowers you to quickly gather data, simply explore and prepare it, and then easily share it.
|Quickly gather all your data into Paxata’s Library
|Simply prepare your data in a Paxata Project – with clicks, not code
|Easily publish your work as a Paxata AnswerSet™ for reliable analytics
Video overview of Paxata
The video below gives a high-level overview of how you will use Paxata.
|Access account specific options like updating your password or logging out.
|Show or hide the Help Panel
|Get helpful information related to the current screen
|Navigate between the screens used to perform specific actions in Paxata. The primary screens are:
|Know when Paxata encounters a warning or error
The Library is where you gather your data. Data that is imported into the Library, such as an Excel spreadsheet, is called a dataset. Once you have imported a dataset, you can begin prepping your data in a Project. See the Prep your data in a Project section of this article. When you have finished prepping your data, you can publish it back to the Library as an AnswerSet. See the Share your prepped data as an AnswerSet section of this article.
Sources of datasets
Datasets can be imported from local files on your computer or from connected data sources. Some examples of connected data sources are:
- Cloud storage like Amazon S3
- The Hadoop Distributed File System (HDFS)
- Relational databases like MySQL
- Secure File Transfer Protocol (SFTP)
Follow these steps to import a dataset from your computer:
On the Library screen, click + import
|Click + Upload local file
Result: The Upload local file panel appears
|To upload a file,
Check the preview. Does your data look correct?
Result: Your data is imported as a dataset and ready to be prepared
|Explore your data using dynamic visuals to highlight patterns, duplicates, blanks, errors and missing data.
|Clean your data by standardizing values, removing duplicates, finding and fixing errors, and more.
|Shape your data using tools like pivot, transpose, group by and more
|Combine additional datasets to enrich your data and provide more context.
On the Projects screen, click + add
|Enter a name for your Project in the Name field
|To save your new Project and start prepping your data
|The FILTER operation combines the functionality of filters with the power of histograms. The result is called a Filtergram™. With a Filtergram, you see the relative frequency of each value in a column and can select values to temporarily hide some of your data. See the Data Filtergrams article.
|The CHANGE operations allow you to standardize the values in a column. For example, you could change all numbers in a column to numeric values
|The COLUMN operations allow you to make changes to a column of data. You can do things like:
|The WHITESPACE operations allow you to remove leading and trailing spaces as well as extra spaces within your data
|The OTHER operation is Cluster + Edit. This operation allows you to find values in a column that are similar and edit them so they are the same
Example: Before using Cluster + Edit, Apple Computer appears in a column as "Apple Computer", "Apple Corporation", and "Apple Computer Corporation". After using Cluster + Edit, all instances of Apple Computer can be standardized to "Apple Computer".
See the Cluster and Edit article.
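The idea behind Cluster + Edit can be illustrated outside Paxata with a minimal Python sketch. It is not Paxata's algorithm: the function names, the `difflib` similarity measure, the 0.5 threshold, and the "Microsoft" value are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def cluster_similar(values, threshold=0.5):
    """Group values that look similar to the first member of a cluster.

    Illustrative only: the similarity measure and threshold are assumptions,
    not a description of how Paxata clusters values.
    """
    clusters = []
    for value in values:
        for cluster in clusters:
            score = SequenceMatcher(None, cluster[0].lower(), value.lower()).ratio()
            if score >= threshold:
                cluster.append(value)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([value])
    return clusters


def standardize(values, clusters):
    """Replace every value with the shortest member of its cluster."""
    mapping = {
        member: min(cluster, key=len)
        for cluster in clusters
        for member in cluster
    }
    return [mapping[value] for value in values]


# Made-up column values based on the example above.
column = [
    "Apple Computer",
    "Apple Corporation",
    "Apple Computer Corporation",
    "Microsoft",
]
clusters = cluster_similar(column)
cleaned = standardize(column, clusters)
print(cleaned)
```

In this sketch the three Apple variants land in one cluster and are rewritten to the shortest variant, while the unrelated value is left alone. In Paxata you choose the replacement value yourself in the Cluster + Edit panel.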
|The more data you have, the harder it can be to notice small details. The highlight tools provide visual cues to help you see:
|When you need to add additional datasets to your Project, use the attach tool. Rows of data can be added to the bottom of your Project. If your datasets have a matching column of data, the additional data can be combined with the data in the Project. See the Lookup article
|Sometimes you may want to make minor adjustments to your columns. The columns tools let you:
|There may be a time when you need to write an expression. Maybe you want to concatenate data from multiple cells into one value, or perform a mathematical operation on your data. The compute tool is how you do that. See the Computed Columns article
|Part of cleaning data is removing information that is not needed. The Remove tool lets you remove rows of data. See the Remove Rows article
|You may find it useful to work with a sample of a dataset before bringing all the data into your Project. For large datasets, this can make initial exploration and discovery easier. The sampling tool also gives you the flexibility to filter down to a specific set of rows in your data, and then sample on the remainder
|Change the shape of your data using the Shape tools. With these tools, you can:
|The auto # tool assigns each row a number. This is helpful if you need to give each row a unique identifier
|Lenses create publishing points from Steps in your Project. When you publish from a Lens, the resulting AnswerSet is a snapshot at a particular Step in your Project. The AnswerSet is saved to the Library. Lenses are also essential for Project Automation because they define the publishing points to use for automated jobs. See the Project Lenses article
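The two behaviors of the attach tool described above, adding rows to the bottom of the Project and combining data through a matching column, can be sketched in plain Python. This is a conceptual illustration only: the function names and sample rows are hypothetical, and the sketch assumes that unmatched rows are kept unchanged, which may differ from how Paxata's lookup handles them.

```python
def append_rows(project_rows, extra_rows):
    """Add the rows of another dataset to the bottom of the Project data."""
    return project_rows + extra_rows


def lookup(project_rows, other_rows, key):
    """Combine columns from another dataset using a matching column.

    Assumption: rows without a match are kept as they are, and values
    already in the Project win when both datasets share a column name.
    """
    index = {row[key]: row for row in other_rows}
    combined = []
    for row in project_rows:
        match = index.get(row[key], {})
        combined.append({**match, **row})
    return combined


# Made-up sample data.
orders = [
    {"customer_id": 1, "total": 40},
    {"customer_id": 2, "total": 15},
]
more_orders = [{"customer_id": 3, "total": 99}]
customers = [{"customer_id": 1, "name": "Ada"}]

print(append_rows(orders, more_orders))
print(lookup(orders, customers, "customer_id"))
```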
When you're ready to save and share the data you prepped, publish it to the Library as an AnswerSet. An AnswerSet is like a dataset, except that it is the published result of your data prep. Once published, you can reuse the AnswerSet in other Projects or export it to share with other applications. AnswerSets are always published to the Library. You can also create an AnswerSet at any time, and for any set of specific Steps in your Project, using a Lens.
Follow these steps to publish an AnswerSet:
|Click steps in the TOOLS menu
Result: The Steps Editor panel opens
|Click the step you want to publish an AnswerSet from
Note: Paxata defaults to the last step in the Project
|At the top of the Steps Editor, click Publish
Result: The Publish AnswerSet to Library window appears
|Enter a name for the AnswerSet in the Name field
Result: The Publishing AnswerSet message appears. Paxata publishes an AnswerSet using the steps up to and including the selected step. The AnswerSet is published to the Library
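The rule that an AnswerSet reflects the steps up to and including the selected step can be pictured with a short Python sketch. Everything here is hypothetical: Paxata Steps are not Python functions, and `publish_answerset` and the three sample steps are names invented for this illustration.

```python
def trim_names(rows):
    """Step 1: remove leading and trailing spaces from the name column."""
    return [{**row, "name": row["name"].strip()} for row in rows]


def drop_blank_names(rows):
    """Step 2: remove rows whose name is empty."""
    return [row for row in rows if row["name"]]


def add_row_number(rows):
    """Step 3: give each row a unique number, starting at 1."""
    return [{**row, "row_id": i} for i, row in enumerate(rows, start=1)]


def publish_answerset(dataset, steps, selected_step=None):
    """Apply the steps up to and including the selected one.

    selected_step is a zero-based index. When it is omitted, the last
    step is used, mirroring the default noted above.
    """
    if selected_step is None:
        selected_step = len(steps) - 1
    rows = [dict(row) for row in dataset]  # leave the source dataset untouched
    for step in steps[: selected_step + 1]:
        rows = step(rows)
    return rows


# Made-up sample data.
dataset = [{"name": " Ada "}, {"name": ""}, {"name": "Grace"}]
steps = [trim_names, drop_blank_names, add_row_number]

print(publish_answerset(dataset, steps, selected_step=0))
print(publish_answerset(dataset, steps))
```

Publishing from the first step keeps the blank row, because the step that removes it comes later. Publishing from the last step applies all three.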
Datasets and AnswerSets can be exported out of Paxata, so you can use your data in other applications.
Follow these steps to download an AnswerSet or dataset locally:
|On the Library screen, hover your mouse over the AnswerSet you want to export
Click the Export button that displays
|Click Download file locally
Result: The Export Settings screen appears
Result: The AnswerSet is downloaded to your computer as a comma-separated values (CSV) file. The Export Log screen appears
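If you have not worked with comma-separated values files before, the following Python sketch shows what such a file contains. It uses only Python's standard `csv` module; the `to_csv_text` helper and the sample rows are made up, and the exact quoting and line endings of a Paxata export may differ.

```python
import csv
import io


def to_csv_text(rows):
    """Render a list of row dictionaries as comma-separated text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()    # first line: the column names
    writer.writerows(rows)  # one line per row of data
    return buffer.getvalue()


# Made-up sample rows standing in for an exported AnswerSet.
rows = [
    {"company": "Apple Computer", "city": "Cupertino, CA"},
    {"company": "Example Co", "city": "Austin"},
]
print(to_csv_text(rows))
```

A value that itself contains a comma is wrapped in double quotes so that other applications can still tell the columns apart.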
|Like a dataset except that it is the published result of your data prep
|The data on which all other action in the Project will be performed
|The source of your dataset
|Data that is imported into the Library is called a dataset
|The combination of the functionality of filters with the power of histograms