Data archiving: reducing amount of parallel batch jobs

When executing data archiving you have to act carefully. The data archiving write and delete processes can consume a lot of CPU power on the database. Also, if you are not careful, you might accidentally claim all background processes. This blog will explain how to limit the number of batch jobs used for data archiving. The data archiving run process itself is described in this blog.

Questions that will be answered in this blog are:

  • How can I limit the number of deletion jobs?
  • How can I restrict the archiving jobs to run on a specific application server only?

Limit amount of deletion jobs

When the write run of data archiving is finished, it may have produced many files. If you are not careful with the deletion and select all files, each file will start its own deletion run. This will consume a lot of CPU power at database level, since the deletion runs fire many DELETE statements at the database in rapid sequence. You might also occupy all batch processes, leaving no room for any business batch job.

Instead of running the deletion from SARA, you can also run the deletion via program RSARCHD:

In this example, MM_EKKO files will be deleted. A maximum of 50 files from 1 archiving run will be processed, with a maximum of 2 deletion batch jobs running at the same time.
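The effect of these two limits can be pictured with a small Python simulation. This is only a sketch of the scheduling idea and says nothing about how RSARCHD works internally. The function `run_deletion`, its parameters `max_files` and `max_parallel_jobs`, and the file names are all made up for this illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_deletion(files, max_files=50, max_parallel_jobs=2):
    """Simulate deleting archive files with a cap on files and on parallel jobs.

    Returns (processed_files, peak_parallel_jobs).
    """
    selected = files[:max_files]          # never take more than max_files
    lock = threading.Lock()
    running = 0
    peak = 0

    def delete_one(name):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.001)                 # stand-in for the real deletion work
        with lock:
            running -= 1
        return name

    # The pool size is the cap on jobs that run at the same time.
    with ThreadPoolExecutor(max_workers=max_parallel_jobs) as pool:
        processed = list(pool.map(delete_one, selected))
    return processed, peak


# Hypothetical file names, more than the limit of 50
files = [f"MM_EKKO_file_{i:03d}" for i in range(1, 121)]
processed, peak = run_deletion(files)
print(len(processed), "files processed, at most", peak, "jobs in parallel")
```

Even though 120 files are available, only the first 50 are picked up, and the number of simultaneously running jobs never exceeds 2. The remaining files simply wait for a later run.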

The general OSS note for this program is 133707 – Data archiving outside transaction SARA.

Relevant bug fix OSS notes:

General application server restrictions via batch job server group

In SM61 you can set up a special batch job server group. Here you can assign a single application server for your data archiving batch job processing. We assume here that you created a group called DATA_ARCH.

In SARA you can now go to the general data archiving settings:

Now you can link the batch job server group:

With the button JobClasses you can specify the job priorities per data archiving function:

A = high priority, C = low priority. The above screen shot is an example.

The second part of OSS note 2269004 – How to reduce parallel archiving jobs on Integration Engine describes the procedure as well. The first part of the note is only relevant for SAP PI.

Data archiving improvement notes 2018

In 2018 SAP ran an improvement project which resulted in a set of OSS notes that make data archiving more robust and easier.

All of these notes come with manual work. Select the ones that are really useful for you.

Archiving write process improvements

Write variant maintenance has been made easier by allowing copying of variants (useful if you have many plants and company codes and want to store each one in a different archive file): 2520093 – Archive administration: Enhanced variant maintenance (writing, preprocessing, and postprocessing).

To be able to detail the written file name of the archive file, implement this OSS note: 2637105 – Print list for archiving write jobs: Placeholders for session numbers, archive file key in title.

Archiving storage process improvements

Archiving system technical check button is available in OAC0, but not in SARA. After applying this note you can also check it in the technical settings in SARA: 2599263 – Connection test for storage systems for archiving object.

Deletion process improvements

To be able to quickly continue with interrupted archiving sessions, apply this note: 2520094 – Continue: Information on existence of interrupted or incomplete archiving sessions.

This note will implement checks to warn you about uncompleted previous store and delete runs: 2586921 – Run selection for deletion: Information about the existence of unstored archive files.

Some archiving objects use the AIS (archive information system) to give the end user quick retrieval of archived information. This note will give a warning before the start of the deletion if the AIS is not active for the object: 2624077 – Starting delete jobs: Check for active info structures.

Archiving overview and logging improvement

To get a better overall overview of all logs, apply OSS note 2433546 – Archive administration logs: Information about errors in hierarchy display. Showing only success messages is possible after applying OSS note 2855641 – Logs: New option “Success Messages Only” for detail log.

Direct navigation to Archive File Browser: apply OSS note 2544517 – Archive administration: Direct navigation to ArchiveFileBrowser. This note only gives you a link. You can already start the archive file browser using transaction AS_AFB:

Archive file browser

Note 2823924 – Archive File Browser: Messages that do not belong to the Archive File Browser are output solves a bug in the Archive File Browser.

SAP database growth control: data archiving business discussions

This blog addresses the main challenge in SAP data archiving for functional objects: the discussions with the business.

This blog will give answers to the following questions:

  • When to start data archiving discussion with the business?
  • How to come to good retention periods?
  • What are arguments for not archiving certain data?

Data archiving discussion with the business

Unlike technical data deletion, functional data archiving cannot be done without proper business discussion and approval.

Depending on your business several aspects for data are important:

  • Auditing and SOX needs
  • Tax and legal retention periods
  • Product data requirements
  • And so on…

Here are some rules of thumb you can use before you consider starting the business discussions about archiving:

Rule of thumb 1: if the system is fairly new, wait at least 3 years to get insight into which tables are growing fast and are worth investigating for data archiving.
Rule of thumb 2: if your system is growing slowly, but the infrastructure capabilities grow faster: only perform technical clean up and don't even start functional data archiving.
Rule of thumb 3: if you are on HANA: check if the data aging concept for functional objects is stable enough and without bugs. Data aging does not require much work: it is only technical and does not require much business discussion. Data retrieval is transparent from the end user perspective.

Data analysis before starting the discussion

If your system is growing fast and/or you are getting performance complaints, then you need to do proper data analysis before starting any business discussion.

Start with a proper analysis of the data. Use the TAANA tool to get insight into the data: what is the distribution of data per document type, per year, per plant/company code, etc. If you want to propose a retention period of, let’s say, 5 years, you can use the TAANA results to show what percentage of data you can move out of the database.
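The percentage calculation is simple enough to sketch in a few lines of Python. The record counts per year below are invented, and the function `archivable_share` is a hypothetical helper, not part of TAANA. The sketch assumes that every year before the cutoff year counts as archivable; your own retention rule may be defined differently (for example on posting date or fiscal year).

```python
def archivable_share(records_per_year, current_year, retention_years):
    """Return the fraction of records that fall before the retention cutoff."""
    cutoff = current_year - retention_years      # years before the cutoff may be archived
    total = sum(records_per_year.values())
    if total == 0:
        return 0.0
    old = sum(n for year, n in records_per_year.items() if year < cutoff)
    return old / total


# Hypothetical record counts per year, as a TAANA analysis might report them
counts = {2012: 400_000, 2013: 450_000, 2014: 500_000, 2015: 550_000,
          2016: 600_000, 2017: 650_000, 2018: 700_000, 2019: 750_000,
          2020: 400_000}

share = archivable_share(counts, current_year=2020, retention_years=5)
print(f"{share:.0%} of the records are older than the retention period")
```

With these sample numbers, 1.35 million of the 5 million records are from before 2015, so a 5 year retention period would move roughly a quarter of the table out of the database.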

Secondly: if you have an idea of which data you want to archive, first execute a trial run on a recent production copy. There might be functional blocks that prevent you from archiving data (such as documents that are not closed).

The third important factor is the ease of data retrieval. Some objects have a nice, simple data retrieval function, and some are really terrible. If the retrieval is good, the business will more easily accept a shorter retention period.

As the last step you can build the business case: how much data will be saved (and hence how much money), how much performance will be gained, and how much time needs to be invested in setting up, checking (testing!) and running the data archiving runs.

In practice a data archiving business case only exists for very large systems of 5 TB and larger. This tipping point changes over time as hardware gets cheaper and hourly manpower costs go up.

The discussion itself

Take plenty of time in planning for the discussion itself. It is not uncommon that archiving discussions take over a year to complete. The better you are prepared, the easier the discussion. It also helps to have a few real performance pain points that can be solved via data archiving. There is normally a business owner for such a pain point who can help push data archiving.

SAP database growth control: data archiving run

This blog will explain how to execute a data archiving run.

Questions that will be answered in this blog are:

  • Which settings do I need to make or check before data archiving run?
  • How to perform the data archiving run?
  • How to validate the data archiving run?
  • How to retrieve that archived data?

This blog assumes you have finished the basic technical data archiving setup as described in this blog.

Functional data archiving example: purchase requisitions

To explain the functional data archiving we will use Purchase Requisitions as example. Technical object name is MM_EBAN.

Start screen SARA MM_EBAN

To see which tables are archived, hit the Database Tables button. Here you can see the list of tables from which data can potentially be archived:

Data base tables MM_EBAN

If you want to see it the other way around (which archiving objects use a given table), enter the table as entry point to retrieve the list of archiving objects. In this example, the archiving objects that delete from table EBAN:

Tables that archive EBAN

Dependency of objects

By clicking the top left button on the archiving object you get the archiving dependency view. For MM_EBAN this is pretty simple: it has no dependencies.

As example for dependencies this is the overview for sales orders (SD_VBAK):

SD_VBAK dependency overview

Here you can see that before you can archive sales orders, you should archive the billing documents first. And for the billing documents, you should archive the deliveries first.
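If you have several dependent objects, the correct execution order is a classic topological sort. Below is a minimal Python sketch (Python 3.9 or newer for `graphlib`) of the sales order example. Only SD_VBAK is a real object name taken from the text; `BILLING_DOCUMENTS` and `DELIVERIES` are placeholder labels, so look up the real archiving object names in the dependency view.

```python
from graphlib import TopologicalSorter

# Map each object to the objects that must be archived before it.
# Only SD_VBAK is a real object name from the text; the other two are placeholders.
must_archive_first = {
    "SD_VBAK (sales orders)": {"BILLING_DOCUMENTS"},
    "BILLING_DOCUMENTS": {"DELIVERIES"},
    "DELIVERIES": set(),
}

# static_order() returns every object after all of its prerequisites.
order = list(TopologicalSorter(must_archive_first).static_order())
for step, name in enumerate(order, start=1):
    print(step, name)
```

For this simple chain the order is deliveries, then billing documents, then sales orders. The same approach keeps working when the dependency view gets wider than a single chain.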

Functional archiving settings

First we have to make or check the object specific functional archiving settings.

Application specific customizing

In the case of purchase requisitions we have to set the retention periods per document type:

Set application specific residence times

Pre-processing step

Some archiving objects have a pre-processing step. MM_EBAN has one as well. In this step data is selected and marked for archiving (often by setting a deletion flag or other indicator).

MM_EBAN preprocessing

In this step create the variant (give it a useful name) by entering the name and pressing Edit. On the next screen fill out your data selection and select the log level. Go back to the first screen and select the start date and spool parameters. When both lights are green, hit the execute button. Click the job log button to check the results.

Example of result of pre-processing run:

Preprocessing result

As you can see, not all selected data is archived. Transactions that are not completed from a business point of view will not be flagged for archiving.

Write run

If you have done the pre-processing step, continue with the write step. Principle is the same: select the data and log level. Important in the write step is to correctly fill the Archiving Session Note with a useful text. This text is put as label on the archive file for later retrieval:

Archiving session note

When done plan the job and execute. Result looks like:

Write summary result

Depending on your technical system settings, the file will be stored automatically or you still need to do this manually.

Storage run

If you have set up the system to store files in a content server, you first have to execute the storage run. For more details see this dedicated blog.

Deletion run

Finally we can now start the deletion run: the actual clean up of old data happens now.

Select the data files you want to archive and start the run.

A word of caution with deletion: please don't select too many files and subsections in one go. Each file subsection will result in a deletion job. The deletion will put significant load on the database, since it removes a lot of data. If you are not careful, you will easily launch 20 or more heavy deletion jobs that run in parallel and might severely decrease system performance.
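One way to stay in control is to plan the deletion in small groups of files and only start the next group when the previous one is finished. The Python sketch below shows just the planning part. The helper `batches`, the group size of 5 and the file names are assumptions for the illustration, not SAP settings.

```python
def batches(items, size):
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 23 hypothetical archive files waiting for deletion
archive_files = [f"file_{i:02d}" for i in range(1, 24)]

plan = list(batches(archive_files, size=5))
for number, group in enumerate(plan, start=1):
    print(f"run {number}: {len(group)} files, {group[0]} .. {group[-1]}")
```

Here the 23 files end up in 5 separate deletion runs, so at most 5 deletion jobs are active at the same time instead of 23.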

Result of archiving deletion run:

Deletion result

Checking archive result

The result checking is possible by looking at the technical correctness of the archive file.

In the archiving object choose the Overview button. Then select the archive file you want to inspect. A correct file should look like this:

Archive administration

In the testing phases and first production runs, you also want to do record counting. A good way is to run the TAANA transaction for key tables you want to archive before the archiving and after the archiving. The difference should match the deletion counter on the write and deletion logs. If you find differences: check for bug fix OSS notes.
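The record count check itself is plain arithmetic, sketched here in Python. The function `reconcile` and all numbers are hypothetical. The sketch assumes that no new records were created in the table between the two TAANA runs, otherwise the counts will not line up even when the archiving is correct.

```python
def reconcile(count_before, count_after, deleted_per_log):
    """Compare the table shrinkage with the deletion counter from the logs.

    Returns the unexplained difference; 0 means the numbers match.
    """
    removed_from_table = count_before - count_after
    return removed_from_table - deleted_per_log


# Hypothetical numbers for one key table
gap = reconcile(count_before=1_250_000, count_after=900_000, deleted_per_log=350_000)
print("match" if gap == 0 else f"difference of {gap} records: investigate")
```

A result other than 0 is the signal to look for bug fix OSS notes for the archiving object.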

Data retrieval

Retrieving archived data differs per archiving object. Some retrieval is nicely integrated into the normal transaction. Some requires an extra transaction. Some retrieval is via a special program.

Data retrieval of purchase requisitions can be done via SARA and choosing the read option.

Here you first need to manually select the archive files to read from (see I did not give the note and regret it, since the file has no meaning now…):

Select files for read program

Result after reading looks like this:

Read program result

SAP database growth control: data archiving general setup

This blog will explain the general technical setup to be performed for SAP data archiving.

Questions that will be answered in this blog are:

  • Which generic settings do I need to make for data archiving in the technology domain?
  • Why should I use a content server to store archive files?

Data archiving content server setup

For data archiving you can use the file system for storing the archive files. This you can do to perform initial testing. For productive use it is best to store the archive files in a content server. It will not be the first time an overzealous basis person in need of file storage deletes some old files in a directory called /archive…..

After you install the content server, set up the customizing for the content server in OAC0 so it can be used for ArchiveLink:

OAC0 define content server

More details are explained in OSS note 2452889 – Assign a content repository to an Archiving Object.

For more details see this dedicated blog.

Data archiving general technical settings

Now start transaction SARA:

SARA start screen

In this initial screen no object is selected. Now press the Customizing button.

Data archiving customizing

Set the Cross-Client File Names/Paths to your needs. You can do that from this menu, or directly from the FILE transaction.

Set the physical path name to be used:

ARCHIVE_GLOBAL_PATH FILE name

Even when you use a content server, the file will first be written to the physical path for temporary storage.

And check the archive file name:

ARCHIVE FILE name

Technical settings per archiving object

Per archiving object you can set the technical settings. Normally you keep settings the same per object. Only for very large installations with archiving or special needs, you might want to deviate.

In the technical settings per data archiving object set the following:

Data archiving technical customizing per object

Important settings to set:

  • Max size in MB or the max objects
  • Check the variants (some variants for production deliberately still have the test tick box switched on: you have to change it)
  • Best to leave the delete jobs set to Not scheduled (large archiving runs can create many files and cause many deletion jobs to kick in at the same time): best to do this manually in a controlled way
  • Starting storage automatically or manually is your choice
  • Best to store before deletion. This is the most conservative setting.
  • Best to delete only from the storage system: if the file is not stored properly for any reason, deletion will not happen. This is the most conservative setting.

2018 improvement notes on Data Archiving

In 2018 SAP released several improvement OSS notes on data archiving. Description can be found in this blog.

Controlling amount of parallel batch jobs

The deletion phase of archiving can lead to an uncontrolled number of parallel batch jobs. See this dedicated blog on how you can control it.

Print list archiving

This blog will explain how to set up print list archiving.

Questions that will be answered are:

  • What is the use case of print list archiving?
  • How to set up print list archiving?
  • How to test print list archiving?
  • How to troubleshoot issues with print list archiving?

Goal of print list archiving

The business sometimes needs to store report output for a longer period of time. They can print the information and put it in their archive. This leads to a big physical archive.

You can also give the business the option to store their output electronically in the SAP content server.

Set up or check content repository

First check which content repository you want to use to store the print lists. The type of content repository must be “ARCHLINK”. Menu path in customizing is as follows:

Set up content repository

Or you can go there directly with transaction OAC0.

Content repository A2 is present in the system by default and is used in the example below. A2 points to the SAP database for storage. For productive use, choose a SAP content server instead of the SAP database.

Customizing for print list archiving

In the following customizing path you find all the actions required for the print list archiving:

Print list archiving customizing

First check that print list document type D01 is present and is using ALF as document class:

Print list document type

In the Edit links section, you can set which content repository document type D01 should use.

Print list to content repository link

Then check if the number ranges for archivelink are properly maintained (if empty create new number range):

Archivelink number ranges

Then activate the print list queues:

Setup print list archive queues

The next step is to select the action to schedule the storage job. This job should not run more often than every 15 minutes.

The final step is to set up the archive printer. You can later on see it with transaction SPAD as well.

Important here: short name must be ARCH. Device type and device class must be set to archiving.

Set up archive printer screen 1

On the access method tab also set access method to archiving.

Set up archive printer screen 2

Now the setup is complete.

Testing print list archiving

The test procedure is described in OSS note 1792336 – Test if a Print List is being Archived.

If you follow this procedure you will initially run into this strange screen:

Error screen

You didn’t do anything wrong yet. The problem is that the option for print to archive is not displayed by default. First go to the properties of a working printer to enable the archiving output option:

Print request properties screen

The rest of the note is self-explanatory:

  • Start SE38 and run program SHOWCOLO
  • Print the output list to printer ARCHIVE with archive mode selected
  • Go to SP01, find the spool, and select menu path Print with changed parameters
  • Hit the Archive button
  • Start transaction OAM1 and hit the execute button next to Archive queue
  • Start transaction OADR to read from the archived print lists
  • From the list take the document and select the button “Display from storage system”

Troubleshooting

If you have issues, please check the troubleshooting OSS note 1775577 – How To and Troubleshooting guide for storing print lists in ArchiveLink.

SAP database growth control: technical cleanup

This blog will explain about technical cleanup to reduce the SAP database growth and to regain control of it.

Questions that will be answered are:

  • How to run the standard SAP clean up jobs?
  • Where can I find full list of items that could be cleaned up?
  • How to run the cleanup of some common objects?
  • Database reorganization after cleanup?
  • How can I clean up old idocs?
  • How can I clean up old table logging?
  • How can I clean up old application logs?
  • How can I clean up old RFC logs?
  • How can I clean up old change pointers?
  • How can I delete workflow logging?
  • How can I archive workflows?
  • How can I delete SAP office documents?

This blog assumes you have followed the step in the blog to get insight into your fast growing SAP tables.

If you run ECC on HANA or S/4HANA, check out this blog on data aging.

This blog focuses on technical data objects archiving and clean up by performing deletion. If you want to setup functional archiving, start reading this blog.

List of technical clean up items

A full list of all possible technical clean up items can be found in OSS note 2388483 – How-To: Data Management for Technical Tables. The chapters below describe the most common ones.

SAP standard clean up jobs

Using SM36 you can plan all SAP standard jobs (which include a lot of clean up jobs for spools, dumps, etc) via the button Standard Jobs.

By hitting the button Default scheduling in an initial system, or after any upgrade or support package, the system will plan its default clean up schedule.

SM36 standard job scheduling

S/4HANA has a different setup of standard jobs. See blog.

Clean up of old idocs

Idoc data is stored in EDI* tables. Largest tables are usually EDI40, EDIDS and EDIDC.

Old idocs can be deleted using transaction WE11.

Idoc deletion

In batch mode you can schedule it as program RSETESTD.

In the bottom of the selection screen are the technical options:

Idoc deletion technical settings

The idoc deletion job can fail if there is too much data to process. If this happens, remove the 4 tickboxes here and use the separate deletion programs: RSWWWIDE, RSARFCER, SBAL_DELETE and RSRLDREL2. Together with RSETESTD itself, these 5 programs will delete the same data, but run more efficiently. This procedure is also explained in OSS note 1574016 – Deleting idocs with WE11/ RSETESTD.

Also check these OSS notes:

Clean up of table logging

Table logging is stored in table DBTABLOG. Deletion can be done using transaction SCU3 and then choosing the option Edit/Logs/Delete, or by using program RSTBPDEL.

After you apply OSS note 2535552 – SCU3: New authorization design for table logging, the new transaction code SCU3_DEL will be available.

DBTABLOG deletion

More background information: OSS note 2335014 – DBTABLOG | Reduce size. Bug fix OSS notes:

Clean up of application logging

Application logging is stored in tables BALDAT and BALHDR. Deletion can be done using transaction SLG2 or by using program SBAL_DELETE.

The last options to fine tune the number of logs per job and the commit counter setting do not appear by default. Select menu option Program/Expert mode first.

Tuned setting for commit counter is described in OSS note 2507213 – SBAL_DELETE runs too long.

Delete old RFC data

Old RFC data can be deleted using transaction SM58, selecting some data, then in the overview screen select the menu option Log File/ Reorganize. Or by starting program RSARFCER.

More background information in OSS note 2899366 – Huge entries in table ARFCSDATA.

This note also tells you to check SMQ1, since qRFCs are also stored in the ARFCSDATA table. See blog on qRFCs.

Delete old change pointers

Old change pointers occupy space in tables BDCP2 and BDCPS. You can use transaction BD22 or report RBDCPCLR/RBDCPCLR2 to delete them.

Delete change pointers

MDG change pointers

If you are using MDG: it has its own set of change pointer tables. Clean up transaction code is MDGCPDEL. Program for batch job clean up is RMDGCPCLR.

Workflows

Workflows are stored in many tables starting with SW*.

You can delete work item history with transaction SWWH or program RSWWHIDE.

Delete workflow item history

This clean up only covers the technical work item history and not the workflow itself. Whether the workflow itself can be deleted or must be archived is a functional decision that depends on the business and audit needs.

The workflow deletion program can create a large number of spools. If this is not wanted, use the NULL printer.

If your business is using the GOS (generic object services) to see workflows linked to a business document, and they cannot retrieve the archived work item, please follow carefully the instructions in OSS note 2356250 – Not able to view archived workflows.

Workflow archiving

Workflow archiving can be done with archiving object WORKITEM. For archiving setup read this blog. This note explains how to run the archiving of the WORKITEM object: 2157048 – Workflow Quick Start Guide to WORKITEM Archiving. Data display for the archived workitems is explained in OSS note 2748817 – How to display Workitems from archive.

Bug fix OSS notes:

Workflow deletion

If you want to delete the actual workflow you have to run program RSWWWIDE.

Take care that before deleting workflows you have checked that these are not needed for audit or financial proof. Some workflows will contain approval steps with a recording of who approved what at which time.

Large amount of documents in SAP inbox

If you have a large amount of items in your SAP inbox, you can delete them via program RSSODLIN. Background is in OSS note 63912 – SAPoffice: Delete user sessions.

Deleting SAP office documents

SAP office documents can be deleted with program RSBCS_REORG. See note 966854 – Reorganization – new report. Note 988057 – Reorganization – information contains a very useful PDF document that explains what to do in cases where RSBCS_REORG cannot directly delete an SAP office document. In most cases you have to run a special program that breaks the link between the document and the data. After that is done you can delete the content.

Test this first and check with the data owner that the documents are no longer needed.

Bug fix OSS notes:

Change documents

Change documents contain the changes made to the data of business objects. If tables CDHDR and CDPOS grow very big, start with an age analysis. You can propose to the business to delete change documents older than 10 years. 10 years is the legal retention period that applies to a lot of data. Deletion is done via program RSCDOK99. If the business does not want to delete the data but wants to keep it in the archive, you can use data archiving object CHANGEDOCU. Retrieval of archived change documents is via transaction RSSCD100.
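The age analysis boils down to counting how many change documents are dated before a cutoff date. Below is a rough Python sketch of that calculation. The function `older_than` and the sample dates are invented; in practice you would take the counts from a table analysis rather than from a list of dates.

```python
from datetime import date


def older_than(change_dates, today, years=10):
    """Count change documents dated before `today` minus `years` years."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # today is 29 February and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - years, day=28)
    return sum(1 for d in change_dates if d < cutoff)


# Hypothetical change document dates
dates = [date(2008, 3, 1), date(2011, 6, 30), date(2012, 1, 15),
         date(2016, 9, 9), date(2021, 2, 2)]

old = older_than(dates, today=date(2022, 1, 1))
print(f"{old} of {len(dates)} change documents are older than 10 years")
```

A document dated exactly on the cutoff date is kept in this sketch. Whether that boundary is right for you is something to agree with the business.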

LTEX table

LTEX table is used for storing ALV extracts data. Use program BALVEXTR to delete old entries. See OSS note 557772 – ALV extracts: Improving the BALVEXTR management report.

Updating statistics

If you are running an Oracle database, it is wise to include the online reorganization of tables or indexes using program RSANAORA as the last step of the technical clean up job. See blog.

SAP database growth control: getting insight

This blog will explain about getting insight into SAP database growth and controlling the growth.

Questions that will be answered are:

  • Do I have a database growth issue?
  • What are my largest tables?
  • How do I categorize my tables?

Why control database growth?

Controlling database growth has several reasons:

  • When converting to S/4 HANA you could end up with a smaller physical HANA blade and need to buy fewer memory licenses from SAP
  • Less data storage leads to lower costs (think also about production data copied back to acceptance, development and sandbox systems)
  • Back up / restore procedures are longer with large databases
  • Performance is better with smaller databases

Database growth

The easiest way to check whether the database is growing too fast is the Database Growth section in the SAP EWA (early watch alert). The EWA has both a graphical and a table representation of the growth:

EWA database growth picture

EWA database growth table

You now have to determine if the growth is acceptable or not. This depends a bit on the usage of the system, the number of users, the business data, and whether you have already stretched your infrastructure.

General rules of thumb: 

1. Growth < 1 GB/month: do not spend time.
2. Growth > 1 GB/month and < 5 GB/month: implement technical clean up.
3. Growth > 5 GB/month: implement technical clean up and check for functional archiving opportunities.
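These rules of thumb translate directly into a small Python function, for example to label a list of systems in one go. The function name `growth_advice` is made up. The rules above do not say what to do at exactly 1 or 5 GB/month, so assigning those boundary values to the higher category is an assumption of this sketch.

```python
def growth_advice(gb_per_month):
    """Map monthly database growth in GB to the rule-of-thumb advice."""
    if gb_per_month < 1:
        return "do not spend time"
    if gb_per_month < 5:
        # exactly 1 GB/month lands here (assumption, not stated in the rules)
        return "implement technical clean up"
    # exactly 5 GB/month lands here (assumption, not stated in the rules)
    return "implement technical clean up and check functional archiving"


# Hypothetical systems and their monthly growth in GB
for system, growth in {"DEV": 0.3, "ERP": 3.2, "BW": 12.0}.items():
    print(system, "->", growth_advice(growth))
```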

Which are my largest tables?

To find the largest tables and indexes in your system start transaction DB02. In here select the option Space/Segments/Detailed Analysis and select all tables larger than 1 GB (or 1000 MB):

DB02 selection of tables larger than 1 GB

Then wait for the results and sort the results by size:

DB02 sorted by size

You can also download the full list.
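Once you have the list outside SAP, the same filtering and sorting is easy to repeat, for example in Python. The CSV layout below (columns `table` and `size_mb`) and the sizes are made up for the illustration; the real download from DB02 will look different and needs its own parsing.

```python
import csv
import io

# Hypothetical export: table name and size in MB, in no particular order
export = """table,size_mb
EDI40,48000
T001,2
BALDAT,15500
CDPOS,9200
USR02,40
DBTABLOG,27000
"""


def largest_tables(csv_text, min_size_mb=1000):
    """Return (table, size_mb) pairs above the threshold, largest first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    big = [(r["table"], int(r["size_mb"])) for r in rows
           if int(r["size_mb"]) > min_size_mb]
    return sorted(big, key=lambda item: item[1], reverse=True)


top = largest_tables(export)
for name, size in top:
    print(f"{name:<10} {size / 1000:>6.1f} GB")
```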

Analysis of the large tables

Processing of the tables is usually done by starting with the largest tables first.

You can divide the tables in following categories:

  1. Technical data: deletion and clean up can be done (logging you don’t want any more like some idoc types, application logging older than 2 years, etc)
  2. Technical data: archiving or storing can be done (idocs you must store, but don’t need fast access to, attachments)
  3. Functional data: archiving might be done here

SAP data management guide

SAP has a best practice document called “Data Management Guide for SAP Business Suite” or “DVM guide”. This document is updated every quarter to half year. The publication location is a bit hidden by SAP under their DVM (data volume management) service. At the bottom here, go to SAP support and open the How-to-guides section. Or search on Google for the term “Data Management Guide for SAP Business Suite” (you might end up with a slightly older version). The guide gives you options per large table to delete and/or archive data.

Common technical objects

Most common technical tables you will come across:

  • EDIDC, EDIDS, EDI40: idocs
  • DBTABLOG: table changes
  • BALHDR, BALDAT: application logging
  • SWW* (all that start with SWW): workflow tables
  • SYS_LOB…..$$: attachments (office attachments and/or DB storage of attachments and/or GOS, generic object services, attachments)
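If you work through a long list of large tables, a first rough labelling based on the names above can save some time. The Python sketch below does a simple prefix match. The function `classify` is hypothetical, and note that the exact table names are treated as prefixes too, so always check the result by hand.

```python
# Prefix rules taken from the list above; anything else is left for manual analysis.
RULES = [
    (("EDIDC", "EDIDS", "EDI40"), "idocs"),
    (("DBTABLOG",), "table changes"),
    (("BALHDR", "BALDAT"), "application logging"),
    (("SWW",), "workflow"),
    (("SYS_LOB",), "attachments"),
]


def classify(table_name):
    """Return a rough category for a table name, or 'unclassified'."""
    name = table_name.upper()
    for prefixes, label in RULES:
        if name.startswith(prefixes):   # startswith accepts a tuple of prefixes
            return label
    return "unclassified"


for table in ["EDI40", "SWWWIHEAD", "BALDAT", "BKPF"]:
    print(f"{table:<10} {classify(table)}")
```

Tables that come back as unclassified, like BKPF here, are the candidates for the functional analysis described in the next section.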

Detailed table analysis for functional tables: TAANA tool

For detailed analysis on functional tables the TAANA (table analysis) tool can be used. Simply start transaction TAANA.

Now create a table analysis variant by giving the table name and selection of the analysis variant:

TAANA start screen

The default variant will only do a record count. Some tables (like BKPF in this example) come with a predefined ARCHIVE variant. This is the most useful option. If this option does not fit your needs, you can also push the create Ad Hoc Report button and define your own variant.

Caution: with the ad hoc variant select your fields with care, since the analysis will count all combinations of the fields you select. Never select table key fields.
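Why key fields are a bad idea becomes clear when you mimic the counting in Python. The rows and field positions below are invented sample data, and `count_combinations` is a hypothetical helper that only imitates the idea of counting records per combination of selected fields.

```python
from collections import Counter

# Hypothetical rows: (document number, company code, fiscal year, document type)
rows = [
    ("0001", "1000", 2017, "SA"),
    ("0002", "1000", 2017, "SA"),
    ("0003", "1000", 2018, "KR"),
    ("0004", "2000", 2018, "KR"),
    ("0005", "2000", 2018, "KR"),
    ("0006", "2000", 2018, "SA"),
]


def count_combinations(rows, field_indexes):
    """Count rows per distinct combination of the selected fields."""
    return Counter(tuple(row[i] for i in field_indexes) for row in rows)


useful = count_combinations(rows, [1, 2])        # company code + fiscal year
with_key = count_combinations(rows, [0, 1, 2])   # document number added

print(len(useful), "result lines without the key field")
print(len(with_key), "result lines with the key field")
```

Without the key field the 6 rows collapse into 3 meaningful result lines. With the document number included, every row becomes its own combination, so the result is as large as the table itself and tells you nothing.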

Results of TAANA are visible after the TAANA batch job is finished.

TAANA result

By running the proper TAANA analysis for a large functional table you get insight into the distribution per year, company code, plant, document type etc. This will help you also estimate the benefits of archiving a specific object.

For TAANA improvement on dynamic subfields, please check this blog.

If you run on HANA, you can also use SE16H for the table analysis.

From analysis to action

For the technical clean up read the special blog on this topic. For functional objects, you need to find the relation from the table to the functional data archiving object. This relation and how to find it is clearly explained in OSS note 2607963 – How to find the relationship between table and archive object.

SAP data volume management via SAP solution manager

SAP offers the option to report on data volume management via SAP solution manager directly or as a subsection in the EWA. Experience so far with this: too long in setup, too buggy. The methods described above are much, much faster and you get insight in a matter of hours. The DVM setup will take you hours to do and days/weeks to wait for results….