Bulk Operations - Best Practices
Here are some tips to make the most of this powerful feature while being mindful of the load it requires.
Bulk Ops Administration - Thinking Lean
- Leverage filter criteria in bulk operations to the fullest so that only the devices that truly need to receive a given bulk operation make it past the applied filters. Consider filtering on the Last Inform date to exclude devices that haven't reported in and synced with the ACS in more than a few days. For whatever reason, those likely represent stale device records in your system that aren't truly active gateways. Look for other creative filter criteria that help each bulk op run as lean and efficiently as possible.
- Consider using 'passive' bulk operations whenever possible. Checking the Solicit Devices checkbox is only required when the nature of the operation is service affecting and therefore must run during non-peak hours specified in the Schedule of the bulk operation.
- For optimal performance, consider spreading out scheduled bulk operations across different days of the week and different hours within your maintenance window. For instance, if three bulk ops must run on Tuesday night during off hours, do not schedule all three from midnight to 2 am. Whenever possible, give each bulk operation its own separate window of time in which to run. Even when kicking off multiple passive bulk ops, consider spreading out the start dates for each.
- Delete old bulk operations that have run their course and served their purpose. Even though most or all of the records meeting the filter criteria may have been processed, the system still has to sift through the database for each bulk op query that is present in the system. This may cause unnecessary load on your instance and degradation of bulk op performance. There is no benefit to leaving old bulk ops behind so make it a point to maintain this regularly by reviewing and deleting them. *
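The Last Inform filter described above can be pictured with a short Python sketch. This is only an illustration of the filtering logic, not the platform's filter syntax; the record layout, the field name `last_inform`, and the three-day cutoff are assumptions made for the example.

```python
from datetime import datetime, timedelta

def recently_informed(devices, now, max_age_days=3):
    """Keep only devices whose last inform is within max_age_days of now."""
    cutoff = now - timedelta(days=max_age_days)
    return [d for d in devices if d["last_inform"] >= cutoff]

# Hypothetical device records; the field names are illustrative only.
now = datetime(2024, 5, 15, 0, 0)
devices = [
    {"serial": "A1", "last_inform": datetime(2024, 5, 14, 22, 0)},
    {"serial": "B2", "last_inform": datetime(2024, 4, 1, 9, 30)},
    {"serial": "C3", "last_inform": datetime(2024, 5, 12, 6, 0)},
]

active = recently_informed(devices, now)
print([d["serial"] for d in active])  # ['A1', 'C3']
```

Device B2 last informed more than a month earlier, so it is treated as a stale record and left out of the operation.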
Bulk Ops Strategy - Is it done yet?
A bulk op can be designed for a wide variety of tasks, too many to anticipate the time required per device for each. Different operations take more or less time to run: some take about a second (or less) per device, while others take 4 or 5 seconds (worst case) per device. Multiply this by the device population (which also varies greatly between providers) and you begin to understand that there is no one-size-fits-all recommendation for how many days a bulk op should take to complete. Other variables unique to your instance, including which features you are running, further complicate predictability.
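The multiplication above can be worked through with a rough Python sketch. The function name and the `concurrent_sessions` parameter are assumptions added for illustration; the result is an idealized lower bound that ignores maintenance windows, devices that are not informing, and other load on the instance.

```python
def estimated_hours(device_count, seconds_per_device, concurrent_sessions=1):
    """Idealized processing time in hours: devices x seconds each, split across sessions."""
    total_seconds = device_count * seconds_per_device / concurrent_sessions
    return total_seconds / 3600

# Illustrative population of 100,000 devices, processed one at a time.
print(round(estimated_hours(100_000, 1), 1))  # 27.8
print(round(estimated_hours(100_000, 5), 1))  # 138.9
```

Even under these optimistic assumptions, the same population takes five times longer when each device needs 5 seconds instead of 1, which is why no single completion time applies to every bulk op.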
* Here is one possible strategy to confirm completion of a bulk operation so that it can then be deleted:
- Once the bulk op has been carefully planned, entered into the platform, and launched, Stop the bulk op after a few days.
- If needed, use the Export to CSV feature to examine the results and tally the success of those first few days of the bulk op.
- Make a copy of the filter criteria query used for this bulk op, then delete the original bulk operation.
- Create a new Bulk Op using the same query and launch it.
- Repeat the above steps and note the quantity of devices that did not complete the bulk op.
- Once the number of devices that don’t complete the bulk op in 48 hours becomes essentially constant, you’re done.
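One way to decide when the count has become "essentially constant" is sketched below in Python. The 2% tolerance and the requirement of three consecutive runs are assumptions chosen for the example, not values recommended by the platform; the counts are made up.

```python
def has_plateaued(incomplete_counts, tolerance=0.02, runs=3):
    """True when the last `runs` counts differ by at most `tolerance` (relative)."""
    if len(incomplete_counts) < runs:
        return False
    recent = incomplete_counts[-runs:]
    low, high = min(recent), max(recent)
    if high == 0:
        return True  # nothing left incomplete
    return (high - low) / high <= tolerance

# Incomplete-device counts noted after each repeat of the steps above.
print(has_plateaued([12000, 4100, 1350]))                   # False
print(has_plateaued([12000, 4100, 1350, 1010, 1000, 995]))  # True
```

In the second series the last three counts differ by about 1.5%, so further repeats are unlikely to reach many more devices.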
It also helps to set the Action of the bulk operation to a script that applies a Label to the device on success. Make the absence of that label part of the query used in the device selection criteria for this bulk op, so that devices that have already succeeded no longer match.
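The label approach can be modeled with a small Python sketch. This is not the platform's scripting language or API; the label name `bulk-op-done`, the record layout, and the `online` flag standing in for a successful action are all hypothetical.

```python
DONE_LABEL = "bulk-op-done"  # hypothetical label name

def select_pending(devices):
    """Devices still matching the filter: those without the success label."""
    return [d for d in devices if DONE_LABEL not in d["labels"]]

def run_pass(devices, action):
    """Run the action on each pending device and label it only on success."""
    for device in select_pending(devices):
        if action(device):
            device["labels"].add(DONE_LABEL)

# Illustrative records; "online" stands in for whether the action succeeds.
devices = [
    {"serial": "A1", "labels": set(), "online": True},
    {"serial": "B2", "labels": set(), "online": False},
]
run_pass(devices, lambda d: d["online"])
print([d["serial"] for d in select_pending(devices)])  # ['B2']
```

Because the filter keys on the missing label, a recreated bulk op with the same query only targets devices that have not yet succeeded.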
Lastly, it is often overlooked that it’s very rare for every device in the query to complete the bulk op. Catching the straggling devices requires customer support, either by reaching out to each customer proactively or by handling issues when the subscriber sees fit to call in. Many providers miss this necessary follow-up, but it must be part of the process since there’s no way to ensure 100% success for any bulk op.
Controlling Maximum Throughput/Throttling
Setting the Max Sessions field in a bulk operation properly requires some insight, and the right value depends on numerous factors. Considerations like Max Sessions and Session Duration are discussed in detail in the Creating Bulk Operations article.
How It Works
When a bulk operation starts, there are N free slots for it, where N is the number entered in the Max Sessions field. The engine retrieves N CPEs from the result set (the list of CPEs that are part of the bulk operation) and solicits them. The system then regularly checks for available slots and solicits an appropriate number of additional devices to receive the bulk op.
For passive bulk operations*, solicits don’t occur. Instead, when a CPE informs, if it’s in the result set derived from the Filter Criteria for that bulk op and there are free bulk operation slots, the bulk operation runs.
*A passive bulk operation is one where the Solicit Devices checkbox is not checked.
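The slot behavior described above can be modeled with a minimal Python sketch. It is a simplified illustration rather than the engine's actual implementation; the class and method names are hypothetical.

```python
class BulkOpSlots:
    """Toy model of Max Sessions slots for one bulk operation."""

    def __init__(self, max_sessions, result_set):
        self.max_sessions = max_sessions
        self.pending = list(result_set)  # CPEs not yet processed
        self.active = set()              # CPEs currently holding a slot

    def free_slots(self):
        return self.max_sessions - len(self.active)

    def solicit_next(self):
        """Solicited mode: fill every free slot from the pending list."""
        batch = self.pending[: self.free_slots()]
        self.pending = self.pending[len(batch):]
        self.active.update(batch)
        return batch

    def on_inform(self, cpe):
        """Passive mode: run only if the CPE is pending and a slot is free."""
        if cpe in self.pending and self.free_slots() > 0:
            self.pending.remove(cpe)
            self.active.add(cpe)
            return True
        return False

    def finish(self, cpe):
        """Release the slot when the CPE's session ends."""
        self.active.discard(cpe)

op = BulkOpSlots(2, ["cpe1", "cpe2", "cpe3", "cpe4"])
print(op.solicit_next())  # ['cpe1', 'cpe2']
op.finish("cpe1")
print(op.solicit_next())  # ['cpe3']
```

In passive mode, `solicit_next` is never called; a device is processed only when it informs on its own and `on_inform` finds a free slot.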
Preparing to Run Firmware Update Operations
Preparation is the key to a successful bulk firmware update. There are several things to consider before updating to a newer firmware version:
- Make and model—Only one CPE type per bulk op can be selected, but you can schedule multiple bulk operations if multiple makes/models of devices require updates.
- Firmware version—Verify that the firmware version being updated from will go directly to the new version. If the CPEs to be updated are on a firmware release that is very far behind the current release, you might need to perform interim updates prior to updating to the current version. Contact the manufacturer of the affected CPEs with questions regarding this.
- Firmware definition—View the firmware definition on the Firmware window of the Administration tab. If any changes need to be made, edit the firmware definition.
- Field preparation—Verify that all CPEs in the selected group are informing.
- Test! Test! Test!—Always run a test update on a small group of CPEs before pushing the update to a large group of CPEs.
- Labels—If you have applied a time zone-related label to your devices, filtering by labels is useful when updating CPEs in multiple time zones.
- Solicit Devices: A firmware update requires the device to reboot, so the operation is briefly service affecting. Select the Solicit Devices option and set a maintenance window when the fewest possible subscribers will be impacted by the update.
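The interim-update consideration in the firmware version item can be sketched in Python as a search over allowed upgrade steps. The version numbers and the `allowed` map are made up for illustration; the real list of supported direct upgrades has to come from the CPE manufacturer.

```python
from collections import deque

def upgrade_path(allowed, current, target):
    """Shortest chain of versions from current to target, or None if unreachable."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in allowed.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical map: each version lists the versions it can update to directly.
allowed = {"1.0": ["1.5"], "1.5": ["2.0", "3.0"], "2.0": ["3.0"]}

print(upgrade_path(allowed, "2.0", "3.0"))  # ['2.0', '3.0']
print(upgrade_path(allowed, "1.0", "3.0"))  # ['1.0', '1.5', '3.0']
```

A path longer than two entries means an interim update is needed, and each hop would be its own bulk operation filtered on the starting firmware version.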
Troubleshooting Failed Bulk Operations
Bulk operations rarely fail on all CPEs if the operation has been set up correctly. However, there can be “failed actions” on a particular CPE. In this case, Device Manager does not attempt to rerun the action against that CPE.
If the operation is scheduled to run multiple times, the CPE might “enter the pool” of selected CPEs a second time. Otherwise, determine the reason the action failed and then run another bulk operation that includes that CPE.
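When many CPEs report failed actions, grouping them by failure reason makes it easier to plan the follow-up bulk operation. The Python sketch below assumes the results are available as CSV text; the column names `serial`, `status`, and `reason` and the sample rows are hypothetical and may not match the actual export.

```python
import csv
import io
from collections import defaultdict

def failures_by_reason(csv_text):
    """Group the serials of failed CPEs by their reported failure reason."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["status"] == "failed":
            groups[row["reason"]].append(row["serial"])
    return dict(groups)

# Made-up results; real exports may use different columns and values.
sample = """serial,status,reason
A1,success,
B2,failed,download timeout
C3,failed,download timeout
D4,failed,invalid parameter
"""

print(failures_by_reason(sample))
```

Each group can then be addressed on its own, for example by fixing the cause and running another bulk operation that includes only those CPEs.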
RELATED ARTICLES:
Bulk Operations - Understanding Options