Optimisation
Unsorted content
The following content needs review and consolidation.
This page gives an overview of some key areas for optimisation.
Performance Optimisation
The goal is to improve the efficiency of PVGIS by optimizing data structures, refining algorithms, and implementing modern Python practices for asynchronous, concurrent, and parallel executions. Additionally, caching strategies and load balancing are essential for enhancing performance and scalability.
Status¶
The current Proof-of-Concept (see commit 5cca629ea186ff3c7711fbdbd8219841caf4d6b) includes, among other elements:

- quite a few constants (see `constants.py`)
- `print()` statements for output and debugging support, which slow down a program's runtime
- debugging calls, specifically `debug(locals())` from `devtools`
- input data validation using Pydantic
- in-function output data validation
- custom data classes/Pydantic models
- use of lists and list comprehensions
- frequently requested/repeated calculations
- lack of caching/memoization practices
- lack of:
    - asynchronous executions,
    - concurrent executions,
    - parallel executions
- no parallel processing beyond NumPy's own internals (?)
- use of Pandas' `DatetimeIndex`, which is not hashable
- no use of any external compiler or library for High-Performance Computing
Hence, the margin for optimisation is quite large.
Profiling¶
Before optimising, however, it is important to quantify performance bottlenecks.
Using profiling tools like `cProfile`, we can analyse and understand which parts of the code consume the most resources.
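A minimal sketch of profiling with `cProfile` and `pstats` from the standard library; the function being profiled is a hypothetical stand-in for a PVGIS calculation:

```python
import cProfile
import io
import pstats

def simulate_pv_output(n: int) -> float:
    # Hypothetical stand-in for an expensive PVGIS calculation
    return sum(i ** 0.5 for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
simulate_pv_output(100_000)
profiler.disable()

# Collect the statistics, sorted by cumulative time, into a string
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)  # the ten most expensive calls
report = stream.getvalue()
```

Alternatively, a whole script can be profiled from the command line with `python -m cProfile -s cumulative script.py`.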
Areas for improvement¶
Out of common/public programmatic experience, documented in books, articles, software projects, publicly accessible wikis and fora, we can list ahead some areas for improvement and discuss possible optimisation actions.
Logging¶
- Replace `print` statements and `debug(locals())` with a structured logging framework like:
    - Python's `logging` module
    - `loguru`
    - `structlog`
- Remove `print` statements completely and return only JSON or other structured output through the Web API in the production version?
Debugging¶
The `debug(locals())` calls from `devtools` can be optimised (?) or removed completely in the production version to reduce overhead.
Data Validation¶
- Avoid redundant checks
- Consider removing/switching off the input data validation for the performance-critical Web API module(s), albeit only after extensive validation of the fundamental algorithms, the core API and the CLI, which can ensure quality and reproducibility of operations.
Example : Efficient Data Validation with Pydantic
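A minimal sketch of such validation, assuming Pydantic v2; the model and field names are hypothetical. Constrained fields validate once at the API boundary, while `model_construct()` skips validation entirely on trusted internal paths:

```python
from pydantic import BaseModel, Field, ValidationError

class SolarRequest(BaseModel):
    # Hypothetical input model; constraints run once, at construction
    latitude: float = Field(ge=-90.0, le=90.0)
    longitude: float = Field(ge=-180.0, le=180.0)
    peak_power_kw: float = Field(default=1.0, gt=0.0)

# Validated path: e.g. at the Web API boundary
request = SolarRequest(latitude=45.8, longitude=8.6)

# Trusted path: bypass validation for already-checked internal data
trusted = SolarRequest.model_construct(
    latitude=45.8, longitude=8.6, peak_power_kw=1.0
)
```

Validating once at the boundary and then passing typed objects around avoids the redundant in-function checks listed above.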
Use libraries developed in C/C++¶
There are numerous libraries/packages developed in C/C++ that can be integrated into Python programs. NumPy and SciPy are prominent examples, known for their effectiveness in handling large datasets.
Use such libraries to speed up operations.
NumPy Arrays¶
NumPy is the gold standard for scientific and high-performance computing with Python. NumPy arrays significantly outpace common Python lists when processing massive data and performing numerical computations, while consuming less memory.
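A small sketch contrasting the two approaches on the same computation (sum of squares over a million elements):

```python
import numpy as np

n = 1_000_000
values = list(range(n))            # one boxed Python object per element
arr = np.arange(n)                 # contiguous buffer of machine integers

# List: Python-level loop, attribute lookups and boxing on every step
total_list = sum(v * v for v in values)

# NumPy: vectorised elementwise multiply and sum in compiled C loops
total_arr = int((arr * arr).sum())
```

Both produce the same result, but the vectorised version typically runs an order of magnitude faster and the array occupies a fraction of the list's memory.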
Do Not Use .dot Operation¶
Dot operations may be time-consuming!
Calling a function through a dotted attribute (e.g. `module.function()`) first invokes `__getattribute__()` or `__getattr__()`, which performs a dictionary lookup. This adds overhead on every call. When optimising Python code for speed, it is recommended to import the needed functions directly into the local namespace.
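A quick way to observe the difference with the standard library's `timeit` (absolute numbers vary by machine, so none are asserted here):

```python
import timeit

# Attribute access: "sqrt" is looked up in the math module dict on every call
t_attribute = timeit.timeit("math.sqrt(2.0)", setup="import math", number=200_000)

# Direct name: the function is bound once; no per-call attribute lookup
t_direct = timeit.timeit("sqrt(2.0)", setup="from math import sqrt", number=200_000)
```

The same trick applies inside hot loops: binding a method to a local variable before the loop (`append = results.append`) removes the repeated lookup.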
Intern Strings in Python¶
CPython automatically interns some strings (identifiers and short compile-time constants). Identical interned strings share a single object, so equality comparisons can reduce to fast identity checks, and dictionary lookups on such keys are quicker. `sys.intern()` can be applied explicitly to strings that are reused heavily, for example as frequent dictionary keys.
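A minimal sketch with `sys.intern()`; the key names are hypothetical:

```python
import sys

# Strings built at runtime are generally not interned automatically
key_a = "station_" + str(42)
key_b = "station_" + str(42)
# key_a == key_b holds, but key_a and key_b are usually distinct objects

# sys.intern() guarantees one shared object per distinct string value,
# so equal interned strings compare by pointer identity
interned_a = sys.intern(key_a)
interned_b = sys.intern(key_b)
```

This is only worthwhile for strings compared or hashed very frequently; interning everything wastes memory in the intern table.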
Generator expressions¶
Use generator expressions instead of list comprehensions when the result is only iterated once, to avoid materialising intermediate lists in memory.
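A short sketch of the memory difference, measured with `sys.getsizeof`:

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # materialises every element up front
squares_gen = (i * i for i in range(n))    # lazy: elements produced on demand

# Aggregations can consume the generator directly, never storing the list
total = sum(squares_gen)

# The generator object stays tiny regardless of n
size_ratio = sys.getsizeof(squares_list) / sys.getsizeof(squares_gen)
```

Note that a generator can only be consumed once; if the same sequence is needed repeatedly, a list (or a NumPy array) is still the right choice.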
Apply multiple assignments¶
Instead of assigning each variable in a separate statement, assign several variables in one multiple-assignment statement.
This approach optimises and speeds up the code execution.
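A minimal sketch; the variable names are hypothetical PV parameters:

```python
# Instead of three separate statements ...
tilt = 35
azimuth = 0
peak_power = 1.0

# ... pack and unpack in a single statement
tilt, azimuth, peak_power = 35, 0, 1.0

# Tuple assignment also swaps values without a temporary variable
tilt, azimuth = azimuth, tilt
```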
Peephole Optimization¶
Code readability often comes with a cost in terms of efficiency. CPython's peephole optimiser recovers some of it automatically: it pre-computes constant expressions at compile time, replaces each repeated instance with the result, and optimises membership tests against literal containers. Writing code that takes advantage of these optimisations helps avoid performance decrease and boosts software performance.
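A small sketch of constant folding in CPython, verifiable by inspecting the compiled code object (function names are illustrative):

```python
def seconds_per_day() -> int:
    # The peephole optimiser folds 24 * 60 * 60 to 86400 at compile time,
    # so the multiplication never happens at runtime
    return 24 * 60 * 60

def is_weekend(day: str) -> bool:
    # Membership tests against literal sets are compiled to frozenset
    # constants, giving O(1) lookups
    return day in {"Sat", "Sun"}

# The folded constant appears directly among the function's constants
folded = 86400 in seconds_per_day.__code__.co_consts
```

Writing `24 * 60 * 60` therefore keeps the code self-documenting without paying for the arithmetic on every call.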
Data structure Optimization¶
Optimise massive time series data structures in advance:

- to be contiguous in time
- to be split into small chunks of data in space

Handle massive time series data programmatically by using efficient data structures like NumPy arrays.
Data Classes¶
- Refactor PVGIS' custom Python `dataclasses` for efficiency
- Use alternatives from well-known libraries:
    - Python's `dataclasses` or `attrs`?

Example : Python Data Class

Use a Python `dataclass` as a decorator to add special methods to classes:
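A minimal sketch with the standard-library `dataclasses` module; the `Location` model is hypothetical. The decorator generates `__init__`, `__repr__` and `__eq__` automatically:

```python
from dataclasses import dataclass, field

@dataclass  # add slots=True on Python >= 3.10 for lower per-instance memory
class Location:
    # Hypothetical PVGIS-style model
    latitude: float
    longitude: float
    elevation: float = 0.0
    tags: list[str] = field(default_factory=list)  # safe mutable default

ispra = Location(latitude=45.81, longitude=8.63)
```

`attrs` offers the same pattern with extra features (validators, converters); benchmarking both against the current custom classes would show which is the better fit.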
Caching/Memoization Strategies¶
Intermediate outputs¶
Some core API functions, though they produce output for different calculated quantities, may depend on identical intermediate components. Hence, it is important to experiment with, understand and apply local and distributed caching strategies.
Caching the output of frequently requested functions or data, for example using lru_cache or similar mechanisms, at the core API level, can decrease the computation time for functions with shared dependencies and consequently reduce response times and server load significantly.
- For local caching, consider Python's built-in caching tools like `functools.lru_cache` for caching the output of functions, especially for functions with expensive or repetitive computations.
- The non-hashable nature of Pandas' `DatetimeIndex` can be a limitation in the context of caching. Are there alternative data structures or methods for handling timestamps?

    Danger

    Does not work with non-hashable data structures!

- For distributed caching, consider tools like Redis or Memcached.
- Caching repeatedly requested final output calls at the Web API level?
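A minimal sketch with `functools.lru_cache`; the declination formula (Cooper's equation) stands in for a shared intermediate computation, and the tuple conversion illustrates one workaround for non-hashable timestamp containers:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=1024)
def solar_declination(day_of_year: int) -> float:
    # Hypothetical shared intermediate; lru_cache requires hashable arguments
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))

def declinations(days: tuple[int, ...]) -> list[float]:
    # A DatetimeIndex is not hashable, so convert timestamps to a hashable
    # form (here: a tuple of day-of-year integers) before hitting the cache
    return [solar_declination(d) for d in days]
```

Repeated calls with the same argument are served from the cache; `solar_declination.cache_info()` exposes hit/miss counters for tuning `maxsize`.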
Asynchronous operations¶
Asynchronous execution for I/O-bound operations can improve the performance of network operations and the responsiveness and efficiency of I/O tasks. It can be implemented using Python's `asyncio` module.
Example : Asynchronous Execution with asyncio
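A minimal sketch with `asyncio.gather`; the fetch function is hypothetical, with `asyncio.sleep` standing in for a real network or database call:

```python
import asyncio

async def fetch_series(location: str) -> dict:
    # Hypothetical I/O-bound task; sleep stands in for a network call
    await asyncio.sleep(0.01)
    return {"location": location, "status": "ok"}

async def main() -> list:
    # Run the I/O-bound requests concurrently instead of one after another;
    # gather preserves the input order of results
    return await asyncio.gather(
        *(fetch_series(loc) for loc in ("Ispra", "Petten"))
    )

results = asyncio.run(main())
```

With sequential awaits the total time is the sum of the waits; with `gather` it is roughly the longest single wait.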
Concurrent operations¶
For CPU-bound tasks, explore Python’s multiprocessing or multithreading (if tasks are I/O-bound) to distribute computations and enhance performance.
- Many intermediate calculations do not depend on each other and can, therefore, be executed independently. Use Python's `concurrent.futures` or similar libraries to manage concurrent tasks.

    Example : Concurrent Executions with `concurrent.futures`

- For independent calculations, explore libraries like `concurrent.futures` to manage concurrent tasks efficiently.
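A minimal sketch with `concurrent.futures.ThreadPoolExecutor`; the component function is a hypothetical independent intermediate calculation:

```python
from concurrent.futures import ThreadPoolExecutor

def component(x: float) -> float:
    # Hypothetical independent intermediate calculation
    return x * x

inputs = [1.0, 2.0, 3.0, 4.0]
with ThreadPoolExecutor(max_workers=4) as executor:
    # map() submits all tasks at once and yields results in input order
    results = list(executor.map(component, inputs))
```

For truly CPU-bound components, swapping in `ProcessPoolExecutor` keeps the same API while sidestepping the GIL.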
Parallel operations¶
- Use parallel processing techniques and software to handle intensive computational tasks.
- Use Python's `multiprocessing` module for CPU-bound tasks to distribute computations across multiple cores.
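A minimal sketch with `multiprocessing.Pool`; the kernel function is hypothetical. The `__main__` guard is required so that worker processes can import the module safely:

```python
from multiprocessing import Pool

def heavy_kernel(x: float) -> float:
    # Hypothetical CPU-bound computation
    return x ** 0.5

if __name__ == "__main__":
    # Workers are separate processes, so they bypass the GIL entirely
    with Pool(processes=2) as pool:
        results = pool.map(heavy_kernel, [1.0, 4.0, 9.0])
```

Process start-up and inter-process data transfer have a cost, so this pays off only when each task does substantial work relative to the size of its inputs and outputs.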
Optimizing Pandas Usage¶
Use vectorized operations and efficient data handling in Pandas.
Example : Vectorised operation using Pandas
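A minimal sketch of a vectorised column operation; the column names and the 20 % efficiency factor are hypothetical placeholders:

```python
import pandas as pd

df = pd.DataFrame({
    "ghi": [100.0, 250.0, 400.0],   # global horizontal irradiance, W/m^2
    "area": [1.6, 1.6, 1.6],        # module area, m^2
})

# Vectorised: whole-column arithmetic in compiled code,
# instead of a Python-level loop over rows (iterrows/apply)
df["power"] = df["ghi"] * df["area"] * 0.2   # hypothetical 20 % efficiency
```

Row-wise `apply` or `iterrows` on large frames can be orders of magnitude slower than the column expression above.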
Algorithmic Efficiency¶
Optimizing the fundamental algorithms and the core API that power PVGIS can reduce computational complexity, which in turn may speed up operations significantly. This is crucial for handling large datasets efficiently and performing complex calculations.
The focus is on :
- reviewing and refactoring core algorithms to reduce complexity
- systematically using efficient libraries like NumPy and SciPy for numerical computations
- following best programming practices, like avoiding Python's comparatively inefficient `for` loops
High Performance Computation with Python ?¶
Explore the great potential of using external libraries/frameworks for High Performance Computing to boost the performance of PVGIS.
- Compilers/Just-in-Time Compilers
    - PyPy: A Just-In-Time (JIT) compiler for Python.
    - mypyc: A compiler that compiles Python to C-extension modules.
    - Pyjion: A JIT compiler for Python, using the .NET CLR.
    - Cython: an optimising static compiler for Python and Cython that allows writing C extensions for Python. Cython gives you the combined power of Python and C, letting you call back and forth between Python and C or C++ code natively.
- Libraries/Frameworks
    - Jax: A library for numerical computations with auto-differentiation and GPU/TPU support.
    - GT4Py: A framework for writing stencil computations in geosciences.
    - Pythran: A compiler-like tool that converts Python to optimized C++ code, but also acts as a library.
    - Dace: A framework for data-centric parallel programming with support for Ahead-of-Time (AoT) compilation in addition to JIT.
Load Balancing¶
Note
A task mainly for and to work collaboratively with the IT support team
- Implement load balancing
- Distribute API requests evenly across multiple servers
- Enhance scalability and reliability
- Collaborate with the IT support team for implementing load balancing. This includes distributing API requests across servers and enhancing system scalability and reliability.