SAS2PY – Automated workload migration

Over 90% effort reduction using automation.

Seamless automated rewrite of your SAS code and ETLs from legacy platforms to Snowflake, BigQuery, or Synapse

SAS to Pandas, Dask, Modin, PySpark, SAS*, Snowpark.
SQLs from Teradata, DB2, Oracle, Netezza to Snowflake, BigQuery, Synapse and more

THE PROBLEM

Manually Converting SAS Code to Python

THE SOLUTION

Automatic conversion of code written in the SAS language to open source, Python 3.x based pandas or PySpark code
Typical use cases
  • API driven
  • Run Anywhere (Docker)
  • Scalable using Docker Container
  • Collaborative Platform
  • Built-in git Integration
  • Syntax Highlighting
  • Jupyter Notebooks
  • Built-in PEP 257 documentation of all generated code
  • Consistent PEP 8 and OOP conventions for a standard coding style
  • Data lineage for impact analysis (all SAS code, including SQLs)
  • Integration with many platforms
  • SDK to extend the functionality

MIGRATION STRATEGY

KEY ADVANTAGES

ROI & Cost Savings

Immediate ROI & Cost savings on SAS Licensing.

Upskill

Ability to upskill existing data scientists for the new age of AI, dominated by Python-based technologies.

Conversion Speed

At least a 3x-10x increase in conversion speed.

Cloud Migration

Cloud migration is straightforward with our built-in support for all three public cloud platforms (AWS, Azure, and Google Cloud), as well as Databricks, Snowflake, and other platforms.

Automated Conversion

Guaranteed 90% out-of-the-box automated conversion saves significant time and money, letting you focus on business value and strategy rather than on months or years of manual work.

Phases of Migration

ANALYZE

Analyze, summarize and catalog the SAS code

CONVERT

Generate Pandas & PySpark code

VALIDATE

Unit tests, Data Validations

INTEGRATE

Cloud Integration, Database Connectivity, Inputs & Outputs

OPERATE

Training, Transition, Job Orchestration, Support

TIERS OF SERVICE

Standard/Convert Tier

Convert SAS to Python by taking a SAS script and giving a Python equivalent with test data and documented code.
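
As an illustration of what a script-to-script conversion of this kind can look like, the sketch below pairs a small SAS step with a hand-written pandas equivalent, a PEP 257 docstring, and in-line test data. It is not actual SAS2PY output; the dataset, the column names, and the function name `summarize_sales` are hypothetical.

```python
"""Hand-written illustration of a SAS step and a pandas equivalent.

Original SAS (hypothetical input):

    proc means data=sales noprint nway;
        class region;
        var amount;
        output out=summary(drop=_type_ _freq_) sum=total mean=average;
    run;
"""
import pandas as pd


def summarize_sales(sales: pd.DataFrame) -> pd.DataFrame:
    """Return the total and average amount per region.

    The input must contain the columns ``region`` and ``amount``.
    Rows with a missing region are left out, as groupby drops them.
    """
    return sales.groupby("region", as_index=False).agg(
        total=("amount", "sum"),
        average=("amount", "mean"),
    )


# Small in-line test data standing in for the SAS input dataset.
sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west"],
        "amount": [10.0, 20.0, 30.0, 40.0],
    }
)
summary = summarize_sales(sales)
print(summary)
```

Keeping the original SAS in the module docstring is one way to make the converted code traceable back to its source during review.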

Premium/Integrate Tier

In addition to the Standard Tier, includes cloud migration and MLOps adoption on AWS, Azure, or Google Cloud. Integrations with Databricks, H2O, Auger, Dataiku, DataRobot, and others are also supported.

Enterprise/Unify Tier

In addition to the Premium Tier, existing Python code and modules are also migrated to the cloud and integrated into MLOps. We can also offer industry best practices and help in strategically positioning your AI adoption.

Analyzer

Complete in-depth analysis of your SAS artifacts to show you what you have.

Lineage

Static code analysis and data lineage of all code including SQLs
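
To make the idea of static lineage concrete, here is a minimal sketch that reads SAS source as text, without running it, and extracts source-to-target dataset pairs. It is an assumption-laden toy, not how SAS2PY works: the regular expressions, the sample code, and the helper names `extract_lineage` and `downstream` are all hypothetical, and real SAS (dataset options, macros, nested queries) needs a proper parser.

```python
import re

# Hypothetical SAS program used only for this illustration.
SAS_CODE = """
data work.clean;
    set raw.customers;
    where status = 'A';
run;

proc sql;
    create table work.report as
    select c.id, o.total
    from work.clean c
    join raw.orders o on c.id = o.customer_id;
quit;
"""

# Deliberately simple patterns; they ignore dataset options and macros.
DATA_STEP = re.compile(r"\bdata\s+([\w.]+)\s*;(.*?)\brun\s*;", re.I | re.S)
SET_STMT = re.compile(r"\b(?:set|merge)\s+([\w.\s]+?);", re.I)
SQL_CREATE = re.compile(r"\bcreate\s+table\s+([\w.]+)\s+as\b(.*?);", re.I | re.S)
SQL_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.I)


def extract_lineage(code):
    """Return a sorted list of (source, target) dataset pairs."""
    edges = set()
    for target, body in DATA_STEP.findall(code):
        for sources in SET_STMT.findall(body):
            for source in sources.split():
                edges.add((source.lower(), target.lower()))
    for target, body in SQL_CREATE.findall(code):
        for source in SQL_SOURCE.findall(body):
            edges.add((source.lower(), target.lower()))
    return sorted(edges)


def downstream(edges, dataset):
    """Return every dataset that depends, directly or not, on `dataset`."""
    affected, frontier = set(), [dataset]
    while frontier:
        current = frontier.pop()
        for source, target in edges:
            if source == current and target not in affected:
                affected.add(target)
                frontier.append(target)
    return sorted(affected)


edges = extract_lineage(SAS_CODE)
print(edges)
print(downstream(edges, "raw.customers"))
```

The `downstream` helper shows how lineage edges support impact analysis: a change to `raw.customers` touches both `work.clean` and `work.report` in this sample.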

Converter

Automatic conversion from SAS to PEP 8 compliant Python (Pandas or PySpark).

Jupyter Integration

Seamless integration with Jupyter, with auto-generated notebooks

Documentation

PEP 257 compliant self-documentation for all generated code

Reconciliation & Validation Framework

Row/Column validation framework for data reconciliation
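
A row/column reconciliation of this kind might be sketched in pandas as follows. This is a generic illustration under stated assumptions, not the SAS2PY framework itself: the function name `reconcile`, the report keys, the numeric tolerance, and the sample data are all hypothetical, and both frames are assumed to contain the key columns.

```python
import pandas as pd


def reconcile(expected, actual, keys, tolerance=1e-9):
    """Compare a SAS-exported DataFrame with a Python-produced one.

    Returns a dict with row-count, column, key, and cell-level findings.
    """
    report = {
        "row_count_match": len(expected) == len(actual),
        "missing_columns": sorted(set(expected.columns) - set(actual.columns)),
        "extra_columns": sorted(set(actual.columns) - set(expected.columns)),
        "mismatched_cells": [],
    }
    # Non-key columns present on both sides are compared cell by cell.
    shared = [c for c in expected.columns if c in actual.columns and c not in keys]
    merged = expected.merge(
        actual, on=keys, how="outer", suffixes=("_sas", "_py"), indicator=True
    )
    report["unmatched_keys"] = int((merged["_merge"] != "both").sum())
    both = merged[merged["_merge"] == "both"]
    for col in shared:
        left, right = both[f"{col}_sas"], both[f"{col}_py"]
        both_missing = left.isna() & right.isna()
        if pd.api.types.is_numeric_dtype(left) and pd.api.types.is_numeric_dtype(right):
            equal = ((left - right).abs() <= tolerance) | both_missing
        else:
            equal = (left == right) | both_missing
        for key in both.loc[~equal, keys].itertuples(index=False, name=None):
            report["mismatched_cells"].append((key, col))
    return report


# Hypothetical outputs: one cell differs between the SAS and Python results.
sas_out = pd.DataFrame({"id": [1, 2, 3], "total": [10.0, 20.0, 30.0]})
py_out = pd.DataFrame({"id": [1, 2, 3], "total": [10.0, 20.5, 30.0]})
report = reconcile(sas_out, py_out, keys=["id"])
print(report)
```

Joining on keys rather than comparing by position keeps the check independent of row order, which often differs between SAS and Python outputs.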

ADVANTAGES OVER OTHER PRODUCTS

SAS2PY
  • Docker-based product, 100% API driven
  • Nothing leaves your network
  • Comprehensive conversion analysis in your environment: 100k lines of code in 10 minutes
  • SAS to both Pandas and PySpark in the same product
  • Over 50 SAS PROCs are migrated, including ML, forecasting, regression, and statistics
  • ETL migration from Teradata and IBM DB2 to Snowflake, BigQuery, or Synapse. Teradata ETL and similar code can also be mapped to native Spark or pandas
  • Guaranteed 90% automation using a compiler/transpiler
  • Static analysis and data lineage throughout the code, including data warehouse DDLs and SQLs. Also integrates with OrientDB and other graph databases for enterprise lineage
  • Self-testing capability that autogenerates its own data for each SAS step and helps remediate execution issues
  • PEP 8 compliant code generation in both function-based and OOP/class-based styles
  • All generated code is self-documented per PEP 257, with documentation generated out of the box
  • Integrates with StreamSets, Informatica, Databricks, Dataiku, and others automatically using the SDK and APIs
  • Not tied to SAS language processing: also processes DDLs and DMLs from a variety of data sources such as SQL, Oracle, Teradata, Impala, Hive, and DB2, and generates lineage and ETL code in Pandas/PySpark
  • Emphasis on model scoring and deployment using SageMaker, MLflow, and similar tools. HP4Score is an important example of a SAS proc used for this
  • No proprietary runtime of any sort; all generated code is open source and documented
  • Upskill existing SAS resources using your favorite editor; scales to hundreds of SAS programmers

OTHERS
  • Mainly reliant on conversion services
  • Code and/or data need to be sent out
  • Code discovery shows only some metrics, not the actual conversion confidence
  • Either Pandas or PySpark, but never both
  • Limited to basic support such as sort, means, transpose, and basic ML, which can be accomplished using the PySpark API alone
  • Only processes Spark SQL and has no SQL transpiler, so relies on copy-pasted SQL strings with no flexibility on target architecture
  • Automation limited to some ETL Data/Macro code and some basic PROCs
  • Data lineage not available
  • No self-testing available, so all testing depends on data being provided
  • No OOP/class-based generation available; code is basic in structure, with global variables and functions (not recommended as a standard)
  • Self-documentation is not available
  • Limited integration with specific technologies only. No SDK available
  • Processes only ETL and related code
  • HP4Score and other SAS procs used for model scoring are not supported
  • Uses a runtime to help with execution, which makes the result non-open source
  • No editor and no scalable support for upskilling resources

TESTING & VALIDATION

Testing and validation are fundamental aspects of any migration. SAS2PY generates self-testing Python code that does not require data to be supplied. A validation framework is also available for reconciling SAS-generated outputs with the outputs of the Python code.