Python

  • How to print a tree (Tweet)

    def print_tree(tree, indent=2, level=0):
        for name, child in tree.items():
            print(' ' * indent * level + name)
            print_tree(child, indent, level + 1)
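    For example, with a hypothetical nested dict:

```python
def print_tree(tree, indent=2, level=0):
    # recursively print each node name, indented by its depth
    for name, child in tree.items():
        print(' ' * indent * level + name)
        print_tree(child, indent, level + 1)

# toy nested dict (made-up data)
tree = {"root": {"a": {"a1": {}}, "b": {}}}
print_tree(tree)
# root
#   a
#     a1
#   b
```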
    
  • How to get the argument names of a function?

    import inspect
    inspect.getfullargspec(<func>).args
    
    In [1]: import inspect
    
    In [2]: def f(x,y):
        ...:     pass
        ...: 
    
    In [3]: inspect.getfullargspec(f).args
    Out[3]: ['x', 'y']
    
  • Why avoid initializing Decimal from a float (Tweet)

    In [1]: from decimal import Decimal
    
    In [2]: Decimal(0.1)
    Out[2]: Decimal('0.1000000000000000055511151231257827021181583404541015625')
    
    In [3]: Decimal('0.1')
    Out[3]: Decimal('0.1')
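    If you already have a float, a common workaround is to round-trip through str(); use Decimal.from_float() only when you really want the exact binary value:

```python
from decimal import Decimal

x = 0.1
# str() gives the shortest repr, avoiding the binary float artifacts
assert Decimal(str(x)) == Decimal('0.1')
# from_float() is explicit about taking the exact binary value
assert Decimal.from_float(x) == Decimal(x)
```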
    
  • Why avoid using mutable objects as default args (Tweet)

    In [1]: def f(k, v, d={}):
       ...:    d[k] = v
       ...:    return d
    
    In [2]: f("x", 1)
    Out[2]: {'x': 1}
    
    In [3]: f("y", 2)
    Out[3]: {'x': 1, 'y': 2}
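    The standard fix is a None default, with the dict created inside the call:

```python
def f(k, v, d=None):
    # a new dict is created on every call unless one is passed in
    if d is None:
        d = {}
    d[k] = v
    return d

print(f("x", 1))  # {'x': 1}
print(f("y", 2))  # {'y': 2} - no leftover 'x' this time
```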
    
  • How to merge PDFs (Tweet)

    from pypdf import PdfWriter
    w = PdfWriter()
    w.append("first.pdf")
    w.append("second.pdf")
    w.write("merged.pdf")
    w.close()
    
  • Updating list in-place (Tweet)

    l2 = l1 = [1, 2]
    l2[:] = ['a', 'b']
    print(l1 is l2, l1, l2)
    # True ['a', 'b'] ['a', 'b']
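    For contrast, plain rebinding creates a new list and breaks the alias:

```python
l2 = l1 = [1, 2]
l2 = ['a', 'b']  # rebinds l2 only; l1 still points at the old list
print(l1 is l2, l1, l2)
# False [1, 2] ['a', 'b']
```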
    
  • https://github.com/haralyzer/haralyzer/ - Lib to read HAR files #tools

  • [extras.pipfile_deprecated_finder.2] 'pip-shims<=0.3.4' does not match '^[a-zA-Z-_.0-9]+$' #troubleshooting - pre-commit autoupdate

  • How to move from pip to Poetry (Tweet)

    poetry init
    cat requirements.txt | cut -d '=' -f 1 | xargs poetry add
    cat requirements-dev.txt | cut -d '=' -f 1 | xargs poetry add --group=dev
    rm requirements.txt requirements-dev.txt
    poetry install
    poetry run <command>
    
  • How to install packages globally when using Poetry? (use pipx)

    pip install pipx
    pipx install <package>
    
  • Install requirements from git using ssh

    pip install git+ssh://git@github.com/<org>/<repo>
    

Anti-Patterns Link to heading

Auth Link to heading

  • Authlib - The ultimate library in building OAuth and OpenID Connect servers

Background tasks Link to heading

Relates to Message Queues

Cache Link to heading

CLI Link to heading

Data Link to heading

pandas Link to heading

  • https://www.pola.rs/ - Lightning-fast DataFrame library for Rust and Python
  • axis: 0 = rows and 1 = columns
  • pandas-profiling - PyPI, docs
  • General
    df.shape  # (rows, columns)
    df.info()
    df.High.mean()  # mean of the High column
    df.Date = pd.to_datetime(df.Date)  # convert column to datetime
    
  • Statistics
    df.describe()  # summary statistics
    df.ride_duration.std()  # standard deviation of the ride_duration column
    
  • Visualization
    df.High.plot()  # line plot of the High column
    df.Volume.hist()  # histogram of the Volume column
    df.plot.scatter('c1', 'c2')  # scatter plot
    df.Low.plot(kind='box')  # box plot
    
  • Missing values
    df.isnull().sum()  # NaN count per column
    df.isnull().sum() / df.shape[0]  # % of missing values
    df.dropna(subset=['user_gender'], axis=0)  # drop rows with NaN in the user_gender column
    
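  • The missing-values calls above, on a tiny made-up DataFrame (column names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "user_gender": ["F", None, "M"],
    "ride_duration": [10.0, 12.5, None],
})

print(df.isnull().sum())                # NaN count per column
print(df.isnull().sum() / df.shape[0])  # fraction of missing values

# drop rows where user_gender is NaN
clean = df.dropna(subset=["user_gender"], axis=0)
print(clean.shape)  # one row fewer than df
```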

Dataclasses Link to heading

JSON Link to heading

  • cysimdjson - Python bindings for simdjson, a C++ JSON parser, reportedly the fastest JSON parser on the planet.
  • ijson - iterative JSON
  • orjson - fast, supports NumPy
  • rapidjson - RapidJSON is an extremely fast C++ JSON parser and serialization library
  • ujson - written in C with Python bindings

ORM Link to heading

  • PugSQL - simple interface for using parameterized SQL

Pipelines Link to heading

AI/Data Link to heading

General Link to heading

  • Airflow GitHub Repo stars
  • ⭐️ Joblib GitHub Repo stars
    • set of tools to provide lightweight pipelining.
    • Main features: disk-caching; parallel helper; fast compressed persistence.
    • How does the cache work? It hashes the arguments to detect changes.
  • Luigi GitHub Repo stars
    • helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.
  • Mara GitHub Repo stars
    • Principles: Data integration pipelines as code; PostgreSQL as a data processing engine; Extensive web ui; No in-app data processing; multiprocessing - single machine pipeline execution; nodes with higher cost are run first
  • Mistral GitHub Repo stars
    • integrated with OpenStack
    • define tasks and workflows in a simple YAML and a distributed environment
  • Ploomber GitHub Repo stars - Docs Link to heading

  • pygrametl GitHub Repo stars
    • provides commonly used functionality for the development of ETL processes.
  • Pypeln GitHub Repo stars
    • for creating concurrent data pipelines
    • Main Features: Simple; Easy-to-use; Flexible; Fine-grained Control.
    • Queues: Process; Thread; Task.
  • ⭐️ pypyr GitHub Repo stars
    • task runner for automation pipelines
    • script sequential task workflow steps in yaml
    • conditional execution, loops, error handling & retries
  • SCOOP GitHub Repo stars
    • distributed task module allowing concurrent parallel programming on various environments, from heterogeneous grids to supercomputers.
    • designed from the following ideas: the future is parallel; simple is beautiful; parallelism should be simpler.
    • brokers: TCP and ZeroMQ
  • SpiffWorkflow GitHub Repo stars
    • workflow engine implemented in pure Python.
    • supports the development of low-code business applications in Python. Using BPMN allows non-developers to describe complex workflow processes in a visual diagram.
    • Built with: lxml; celery.

Profiling Link to heading

Profiler        | What     | Granularity   | How
----------------|----------|---------------|-----------------
timeit          | run time | snippet-level |
cProfile        | run time | method-level  | deterministic
statprof.py     | run time | method-level  | statistical
line_profiler   | run time | line-level    | deterministic
memory_profiler | memory   | line-level    | +- deterministic
pympler         | memory   | method-level  | deterministic

Source: https://www.youtube.com/watch?v=DUCMjsrYSrQ
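Snippet-level timing with timeit (stdlib) looks like this:

```python
import timeit

# total time in seconds to run the snippet 1000 times
elapsed = timeit.timeit("sum(range(100))", number=1_000)
print(f"{elapsed:.6f}s for 1000 runs")
```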

PyCharm Link to heading

PyPI mirror Link to heading

Retry Link to heading

Strings Link to heading

Formatting Link to heading

% operator (Tweet) Link to heading

  • %s: String conversion.
  • %d or %i: Integer conversion.
  • %f: Float conversion.
  • %o: Octal conversion.
  • %x or %X: Hexadecimal conversion.
  • %e or %E: Exponential notation conversion.
In [1]: "%s %d %f %o %x %e" % ("a", 1, 1.0, 8, 16, 100)
Out[1]: 'a 1 1.000000 10 10 1.000000e+02'

f-string Link to heading

Source: https://fstring.help/

  • debugging (Tweet)
    user = "eric_idle"
    f"{user=}"
    # "user='eric_idle'"
    f"{user = }"
    # "user = 'eric_idle'"
    
  • padding (Tweet)
    value = "test"
    f"{value:>10}"
    # '      test'
    f"{value:<10}"
    # 'test      '
    f"{value:_<10}"
    # 'test______'
    f"{value:^10}"
    # '   test   '
    

Parsing Link to heading

  • parse - Parse strings using a specification based on the Python format() syntax.
  • ttp - Template Text Parser

Tests Link to heading

  • How to print logs when running pytest? (Tweet)
    pytest --log-cli-level DEBUG
    

Fixtures Link to heading

  • Why use fixtures instead of module-level variables for mocked data (Tweet)

    # without fixture
    
    MOCK_DATA = [{"field": "value"}]
    
    
    def test_one():
        MOCK_DATA[0]['field'] = 'other value'
        assert MOCK_DATA[0]['field'] == 'other value'
    
    
    def test_two():
        assert MOCK_DATA[0]['field'] == 'value'
    
    
    # with fixture
    
    import pytest
    
    @pytest.fixture
    def mock_data():
        return [{"field": "value"}]
    
    
    def test_three(mock_data):
        mock_data[0]['field'] = 'other value'
        assert mock_data[0]['field'] == 'other value'
    
    
    def test_four(mock_data):
        assert mock_data[0]['field'] == 'value'
    
    # pytest output for the version without the fixture:
    # test_two fails because test_one mutated the shared MOCK_DATA
    
        def test_two():
    >       assert MOCK_DATA[0]['field'] == 'value'
    E       AssertionError: assert 'other value' == 'value'
    E         - value
    E         + other value
    
    path/to/tests/test_zero.py:15: AssertionError
    ===================== 1 failed, 3 passed in 0.18s =====================
    

Speccing Link to heading

  • Why to use spec when using Mock? (Tweet)
    from unittest.mock import Mock
    
    class MyClass:
        pass
    
    # without spec
    Mock().wrong_method()
    # Out: <Mock name='mock.wrong_method()' id='140607049530000'>
    
    # with spec
    Mock(spec=MyClass).wrong_method()
    # raises "AttributeError: Mock object has no attribute 'wrong_method'"
    
  • Why to use autospec? (Tweet)
    from unittest.mock import create_autospec, Mock
    
    class MyClass:
        myobj = object
    
    # without autospec
    Mock(spec=MyClass).myobj.wrong_method()
    # Out: <Mock name='mock.myobj.wrong_method()' id='140671042320272'>
    
    # with autospec
    create_autospec(MyClass).myobj.wrong_method()
    # raises "AttributeError: Mock object has no attribute 'wrong_method'"
    

Web Link to heading

GraphQL Server Link to heading

Keycloak Link to heading

requests Link to heading

RPC Link to heading

  • gRPC
  • RPyC - Docs - library for symmetrical remote procedure calls, clustering, and distributed-computing #OpenSource

SSH Link to heading

WebAssembly Link to heading

  • Extism - The cross-language framework - SDK
  • Pyodide - distribution for the browser and Node.js based on WebAssembly - GitHub
  • PyScript - Run Python in Your HTML