TLDR: Scheduled refresh will queue your dataset for processing with datasets of other customers (scheduled at the same time) and will be processed when resources are available. Scheduled time can differ from actual processing time up to 1 hour. This delay is not present when refresh is triggered by API call which can serve as workaround.
Couple of days ago I published operational report to Power BI Service and set scheduled refresh to keep it updated throughout the working hours with option to refresh it on demand when necessary. Since then I noticed that most of the time I want to refresh my report manually, it is in state of currently running refresh. When I checked the refresh times, the duration ranged between 10 to 20 minutes. Strange. When I tried to refresh it manually the duration never exceeded 1 minute.
This led me to investigation process of identifying problematic query or data source. After few tries of isolating queries I found out that the issue is occurring only when the dataset is ran by Scheduled refresh mechanism. Google helped me out to find the correct page in Microsoft Docs with short note describing that this behavior is a feature:
Note also that the configured refresh time might not be the exact time when Power BI starts the next scheduled process. Power BI starts scheduled refreshes on a best effort basis. The target is to initiate the refresh within 15 minutes of the scheduled time slot, but a delay of up to one hour can occur if the service can’t allocate the required resources sooner.
https://docs.microsoft.com/en-us/power-bi/connect-data/refresh-data#configure-scheduled-refresh
Fine, I got it – we are on shared capacity and we can only set refresh times every 30 minutes, so that must cause peek processing times. I just wonder why we cannot spread the schedule times more evenly and maybe get rid of these peeks naturally? Or why we don’t see the queue time duration and be actually able to start on demand refresh while we are waiting? Because currently you are just blocked by the scheduled refresh and have to wait.
So how can we programmatically start the on-demand refresh to skip the queue “feature”? One pretty easy solution is to create pipeline in Azure Data Factory which will trigger dataset refresh with a call to Power BI API. Then we can create scheduled runs as we need it (8 refreshes per day still applies) and get real refresh duration for our datasets.
If you do have another solution, please write a comment below and share your workaround.