If you do not know what __path__ is,
this talk is NOT for you.
Sorry.
import this, that, and the other thing
Custom importers in Python

Brett Cannon
www.DrBrett.ca
brett@python.org


Slides are sparse, so do listen to what I say.
Thanks ...


• Python Software Foundation
 • PyCon Financial Aid committee
• Nasuni
 • Jesse Noller
What the heck is an
             importer?



Relevant since Python 2.3
importer = finder + loader
A finder finds modules.
A loader loads modules.
“Why do I want one?”



Customization and control; easier to work with than overriding __import__.
How are custom
             importers used by
                  import?


Simplified view; ignoring implicit importers
Meta path
 sys.meta_path
Start
  for finder in sys.meta_path:
      loader = finder.find_module(name, path)
          False -> try the next finder
          True  -> return loader.load_module(name)
  ... (no finder had a loader: continue with the path search)



What the ‘path’ arg is: None for a top-level import, otherwise the parent package's __path__.
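
A minimal sketch of the meta path step, written as a plain function so it runs without touching the real import machinery. `import_via_meta_path`, `DictFinder`, and `DictLoader` are hypothetical names for illustration; only the `find_module(name, path)` / `load_module(name)` protocol comes from the slide. Later Python releases replaced this protocol with `find_spec`, so treat this as a model of the flow rather than something to register on `sys.meta_path` as-is.

```python
import types


class DictLoader:
    """Toy loader (hypothetical): executes source text held in memory."""

    def __init__(self, source):
        self.source = source

    def load_module(self, name):
        # A real loader must also insert the module into sys.modules;
        # that is skipped here to keep the sketch free of side effects.
        module = types.ModuleType(name)
        module.__loader__ = self
        exec(self.source, module.__dict__)
        return module


class DictFinder:
    """Toy finder (hypothetical): knows a fixed mapping of name -> source."""

    def __init__(self, sources):
        self.sources = sources

    def find_module(self, name, path=None):
        # Return a loader if this finder can handle the name, else None.
        if name in self.sources:
            return DictLoader(self.sources[name])
        return None


def import_via_meta_path(name, path, meta_path):
    """Walk the finders in order; the first one that returns a loader wins."""
    for finder in meta_path:
        loader = finder.find_module(name, path)
        if loader is not None:
            return loader.load_module(name)
    # The import statement would continue with the path-based search here.
    return None


meta_path = [DictFinder({}), DictFinder({"spam": "ANSWER = 42"})]
module = import_via_meta_path("spam", None, meta_path)
print(module.ANSWER)
```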
Path
sys.path or __path__, sys.path_hooks,
      & sys.path_importer_cache
...
  Parent module has __path__?
      False -> search = sys.path
      True  -> search = parent's __path__
  -> search
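
A small sketch of that decision, following the simplified diagram. `choose_search` and its `modules` parameter are hypothetical; the parameter only exists so the example can run against a fake module table instead of the real `sys.modules`.

```python
import sys
import types


def choose_search(fullname, modules=None):
    """Return the list of path entries to search for `fullname`."""
    modules = sys.modules if modules is None else modules
    parent_name = fullname.rpartition(".")[0]
    parent = modules.get(parent_name) if parent_name else None
    parent_path = getattr(parent, "__path__", None)
    if parent_path is not None:
        return parent_path      # submodule: search the parent's __path__
    # Simplified as on the slide; the real import statement reports an
    # error when the parent exists but is not a package.
    return sys.path


package = types.ModuleType("pkg")
package.__path__ = ["/path/to/pkg"]
fake_modules = {"pkg": package}

print(choose_search("pkg.mod", fake_modules))
print(choose_search("top", fake_modules) is sys.path)
```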
Search
  for entry in search:
      finder = sys.path_importer_cache[entry]
          False (entry not cached) -> path hook -> finder
          True  -> finder
      loader = finder.find_module(name)
          False -> try the next entry
          True  -> return loader.load_module(name)
  raise ImportError (search exhausted)
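
The search loop can be sketched as a plain function. Everything here is illustrative: `search_for`, `ToyFinder`, and `ToyLoader` are hypothetical names, the cache and the path hook step are passed in as arguments rather than read from `sys`, and `None` stands in for the "dummy" cache entry.

```python
import types


class ToyLoader:
    def __init__(self, entry):
        self.entry = entry

    def load_module(self, name):
        module = types.ModuleType(name)
        module.__file__ = "{}/{}.py".format(self.entry, name)
        return module


class ToyFinder:
    def __init__(self, entry, names):
        self.entry = entry
        self.names = names

    def find_module(self, name):
        return ToyLoader(self.entry) if name in self.names else None


def search_for(name, search, importer_cache, path_hook):
    """Try each path entry in order until some finder produces a loader."""
    for entry in search:
        try:
            finder = importer_cache[entry]
        except KeyError:
            # Not cached yet: ask the path hooks (which fill the cache).
            finder = path_hook(entry)
        if finder is None:
            continue            # dummy entry: nothing can handle this path
        loader = finder.find_module(name)
        if loader is not None:
            return loader.load_module(name)
    raise ImportError("No module named {}".format(name))


cache = {
    "first": ToyFinder("first", set()),
    "second": ToyFinder("second", {"spam"}),
}


def no_hooks(entry):
    # Stand-in for the path hook step: no hook accepts anything.
    cache[entry] = None
    return None


module = search_for("spam", ["first", "unknown", "second"], cache, no_hooks)
print(module.__file__)
```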
Path hook
  for hook in sys.path_hooks:
      finder = hook(entry)
          False -> try the next hook
          True  -> sys.path_importer_cache[entry] = finder -> finder
  sys.path_importer_cache[entry] = dummy (no hook accepted the entry)

True = ImportError not raised; False = ImportError raised.
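
A sketch of the path hook step under the same simplifications. `finder_for`, `zip_hook`, and `refusing_hook` are hypothetical; the hooks and cache are passed in explicitly, the "finder" is just a string, and `None` plays the role of the dummy entry.

```python
def finder_for(entry, path_hooks, importer_cache):
    """Ask each hook for a finder; cache whatever the outcome is."""
    for hook in path_hooks:
        try:
            finder = hook(entry)
        except ImportError:
            continue                    # this hook cannot handle the entry
        importer_cache[entry] = finder
        return finder
    importer_cache[entry] = None        # dummy: no hook accepted the entry
    return None


def refusing_hook(entry):
    raise ImportError("never handles anything")


def zip_hook(entry):
    if not entry.endswith(".zip"):
        raise ImportError("not a zip file: {!r}".format(entry))
    return "zip finder for " + entry    # stand-in for a real finder object


cache = {}
hooks = [refusing_hook, zip_hook]
found = finder_for("code.zip", hooks, cache)
missing = finder_for("plain_dir", hooks, cache)
print(cache)
```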
How do I write my own importer?
Only masochists need apply.
Option 1:
 Painfully from
     scratch
Read PEP 302 for the gory details.
Option 2:
Use importlib
     Available since Python 3.1.
I have suffered so you don’t have to.
Option 3:
                         importers
               http://packages.python.org/importers/

           File path abstraction on top of importlib.
           I am treating it as purgatory before possible inclusion in importlib.




If there is one lesson here, it is to use option 2 or 3, depending on your needs.
The rest of the talk is about the lessons that led to ‘importers’.
Using a
                   zipfile importer
                    as an example



Assuming use of importlib.
Talking from perspective of using an archive.
We need a hook
   For sys.path_hooks.
Refresher:
              Hooks look for a
              finder for a path


Path either from sys.path or __path__
Hooks can get
                funky paths
            E.g. /path/to/file/code.zip/some/pkg




Search backwards through the path looking for a file; if you hit a directory first, you have gone too far.
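
One way to do that backwards search, as a sketch. `split_archive_path` is a hypothetical helper, and it assumes the archive lives on the local file system so `os.path` can be used to probe it.

```python
import os
import tempfile


def split_archive_path(path):
    """Split `path` into (archive file, path inside the archive)."""
    archive = os.path.normpath(path)
    inside = []
    while archive:
        if os.path.isfile(archive):
            return archive, "/".join(reversed(inside))
        if os.path.isdir(archive):
            break                       # found a directory: gone too far
        archive, tail = os.path.split(archive)
        if not tail:
            break                       # reached the root
        inside.append(tail)
    raise ImportError("no archive file in {!r}".format(path))


with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "code.zip")
    with open(zip_path, "wb"):
        pass                            # an empty file is enough for the demo
    funky = os.path.join(zip_path, "some", "pkg")
    archive, inside = split_archive_path(funky)
    found_archive = archive == os.path.normpath(zip_path)
    try:
        split_archive_path(os.path.join(tmp, "no", "such", "thing"))
        too_far = False
    except ImportError:
        too_far = True

print(found_archive, inside, too_far)
```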
Consider caching
         archive file objects



No need to keep three connection objects open for the same sqlite3 file.
Pass your finder the “location”:
  1) the path/object &
  2) the package path


Import assumes you are looking in a part of a package.
Raise ImportError
if you got nuthin’
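
Putting the last few slides together, a hedged sketch of a hook: it shares one open archive object per file, hands the finder both the archive and the package path, and raises ImportError when the entry is not its business. `archive_hook`, `ArchiveFinder`, and `_open_archives` are hypothetical names, and the path is split naively on the first ".zip" to keep the example short.

```python
import os
import tempfile
import zipfile

_open_archives = {}     # archive path -> open ZipFile, shared by all finders


class ArchiveFinder:
    def __init__(self, archive, package_path):
        self.archive = archive              # 1) the open archive object
        self.package_path = package_path    # 2) where in the package we are


def archive_hook(entry):
    # Simplification: assume the archive file name ends in ".zip".
    head, sep, tail = entry.partition(".zip")
    archive_path = head + sep
    if not sep or not zipfile.is_zipfile(archive_path):
        raise ImportError("not inside a zip archive: {!r}".format(entry))
    if archive_path not in _open_archives:
        _open_archives[archive_path] = zipfile.ZipFile(archive_path)
    package_path = tail.replace(os.sep, "/").strip("/")
    return ArchiveFinder(_open_archives[archive_path], package_path)


with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "code.zip")
    with zipfile.ZipFile(zip_path, "w") as writer:
        writer.writestr("some/pkg/__init__.py", "")
    top = archive_hook(zip_path)
    nested = archive_hook(os.path.join(zip_path, "some", "pkg"))
    shared = top.archive is nested.archive
    try:
        archive_hook(tmp)
        refused = False
    except ImportError:
        refused = True
    # Close the cached archives before the temporary directory goes away.
    for archive in _open_archives.values():
        archive.close()
    _open_archives.clear()

print(shared, nested.package_path, refused)
```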
Have finder, will
 look for code
Don’t treat modules
        as code but as files



Just trust me. Too many people and too much code already assume modules are files, e.g. when using __file__, __path__, etc.
You did remember
               where in the
              package you are
             looking, RIGHT?!?


Needed because of __path__ manipulation by user code.
fullname.rpartition('.')[-1]
Need to care
        about packages &
            modules
                      some/pkg/name/__init__.py
                                 and
                          some/pkg/name.py




Care about bytecode if you want.
Notice how many stat calls this takes?
Avoid caching
within a finder
 Blame sys.path_importer_cache
Tell the loader
  if package &
  path to code
Don’t Repeat Yourself ... within reason.
Nuthin’?
Give back None
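
A sketch of a finder along these lines. `ArchiveFinder` and `RecordingLoader` are hypothetical, the archive is reduced to a set of member names, and bytecode is ignored. The finder works out whether it found a package and where the code lives, passes both to the loader so the loader need not rediscover them, and gives back None when nothing matches.

```python
class RecordingLoader:
    """Placeholder loader: only records what the finder worked out."""

    def __init__(self, path, package):
        self.path = path            # path to the code inside the archive
        self.package = package      # True if the code is a package __init__


class ArchiveFinder:
    def __init__(self, members, package_path):
        self.members = members              # names stored in the archive
        self.package_path = package_path    # where in the package to look

    def find_module(self, fullname):
        # Only the last part of the dotted name matters at this location.
        tail = fullname.rpartition(".")[-1]
        base = "/".join(part for part in (self.package_path, tail) if part)
        # Check for a package first, then for a plain module.
        for suffix, package in (("/__init__.py", True), (".py", False)):
            path = base + suffix
            if path in self.members:
                return RecordingLoader(path, package)
        return None


members = {
    "some/pkg/__init__.py",
    "some/pkg/name/__init__.py",
    "some/pkg/other.py",
}
finder = ArchiveFinder(members, "some/pkg")
package = finder.find_module("some.pkg.name")
module = finder.find_module("some.pkg.other")
nothing = finder.find_module("some.pkg.missing")
print(package.path, module.path, nothing)
```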
Now it gets
  tricky
  Writing a loader.
Are you still
thinking in terms of
    file paths?
importlib.abc.PyLoader


           • source_path()
              • Might be changing...
           • is_package()
           • get_data()



Everything in terms of exactly what it takes to import source
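
A runnable sketch of a source-only loader for an archive. Note the swap: `importlib.abc.PyLoader` is not available in current Python releases, so this uses `importlib.abc.SourceLoader` instead, where `get_filename()` plays roughly the role of `source_path()`. `ArchiveSourceLoader`, its constructor arguments, and the "/fake/code.zip" anchor are all made up for illustration.

```python
import importlib.abc
import importlib.util
import io
import zipfile


class ArchiveSourceLoader(importlib.abc.SourceLoader):
    """Loads the source of one module out of an open zip archive."""

    def __init__(self, archive, anchor, member, package):
        self.archive = archive      # open zipfile.ZipFile
        self.anchor = anchor        # where the archive lives
        self.member = member        # path of the source inside the archive
        self.package = package      # True if member is a package __init__.py

    def get_filename(self, fullname):
        # Always anchored at the archive, never a bare relative path.
        return self.anchor + "/" + self.member

    def is_package(self, fullname):
        return self.package

    def get_data(self, path):
        prefix = self.anchor + "/"
        if not path.startswith(prefix):
            raise OSError("path is outside the archive: {!r}".format(path))
        try:
            return self.archive.read(path[len(prefix):])
        except KeyError:
            raise OSError("no such archive member: {!r}".format(path))


# Build a tiny archive in memory so the sketch runs on its own.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as writer:
    writer.writestr("pkg/mod.py", "VALUE = 6 * 7\n")

archive = zipfile.ZipFile(buffer)
loader = ArchiveSourceLoader(archive, "/fake/code.zip", "pkg/mod.py", False)
spec = importlib.util.spec_from_loader("pkg.mod", loader)
module = importlib.util.module_from_spec(spec)
loader.exec_module(module)
print(module.VALUE, module.__file__)
```

The module is created by hand here instead of being imported, so nothing is added to sys.modules or sys.path_hooks.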
importlib.abc.PyPycLoader

           • source_path()
           • is_package()
           • get_data()
           • source_mtime()
           • bytecode_path()
              • Might be changing...


This is what is needed to get source w/ bytecode right
Reasons to ignore .pyc

• Jython, IronPython couldn’t care less.
 • Safe to support, though.
• Another thing to code up.
 • Bytecode is just an optimization.
 • If you only ship .pyc for code
   protection, stop it.
What to do when
     using
importlib ABCs
Require anchor
            point for paths
                 somewhere/mod.py is too ambiguous




Too hazy as to where a relative path is anchored; archive? Package location?
Consider caching
            stat calls
                Only for stand-alone loaders!
            Also consider caching if package or not.




Consider whether storage is read-only, append-only, or read-write.
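
A minimal sketch of caching a stat call in a per-module loader. `StatCachingLoader` is hypothetical, and the `stat` argument exists only so the example can count calls. The cache is safe here only because the loader is assumed to be short-lived and to serve a single module.

```python
import os
import tempfile


class StatCachingLoader:
    """Hypothetical stand-alone loader that stats its source at most once."""

    def __init__(self, path, stat=os.stat):
        self.path = path
        self._stat = stat
        self._mtime = None

    def source_mtime(self, fullname=None):
        if self._mtime is None:
            self._mtime = int(self._stat(self.path).st_mtime)
        return self._mtime


calls = []


def counting_stat(path):
    calls.append(path)
    return os.stat(path)


with tempfile.NamedTemporaryFile(suffix=".py") as source:
    loader = StatCachingLoader(source.name, stat=counting_stat)
    first = loader.source_mtime()
    second = loader.source_mtime()

print(len(calls))
```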
Don’t overdo
error checking
   EAFP is your friend.
Perk of importers is
  the abstraction
Lazy loader mix-in
    written in
     19 lines
import sys
import types

class Module(types.ModuleType):
    pass

class Mixin:
    def load_module(self, name):
        # Already in sys.modules (e.g. a reload): defer to the real loader.
        if name in sys.modules:
            return super().load_module(name)
        # Create a lazy module that will type check.
        module = LazyModule(name)
        # Set the loader on the module as ModuleType will not.
        module.__loader__ = self
        # Insert the module into sys.modules.
        sys.modules[name] = module
        return module

class LazyModule(types.ModuleType):
    def __getattribute__(self, attr):
        # Remove this __getattribute__ method by re-assigning.
        self.__class__ = Module
        # Fetch the real loader.
        self.__loader__ = super(Mixin, self.__loader__)
        # Actually load the module.
        self.__loader__.load_module(self.__name__)
        # Return the requested attribute.
        return getattr(self, attr)
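
A usage sketch for the mix-in. The three classes are repeated without comments so the block runs on its own; `ToyLoader` and `LazyToyLoader` are hypothetical, and the base loader implements `load_module` by hand rather than relying on importlib.

```python
import sys
import types


class Module(types.ModuleType):
    pass


class Mixin:
    def load_module(self, name):
        if name in sys.modules:
            return super().load_module(name)
        module = LazyModule(name)
        module.__loader__ = self
        sys.modules[name] = module
        return module


class LazyModule(types.ModuleType):
    def __getattribute__(self, attr):
        self.__class__ = Module
        self.__loader__ = super(Mixin, self.__loader__)
        self.__loader__.load_module(self.__name__)
        return getattr(self, attr)


class ToyLoader:
    """Toy "real" loader: executes in-memory source in the module."""

    def __init__(self, source):
        self.source = source

    def load_module(self, name):
        # Reuse an existing module (the lazy one) if it is already there.
        module = sys.modules.setdefault(name, types.ModuleType(name))
        exec(self.source, module.__dict__)
        return module


class LazyToyLoader(Mixin, ToyLoader):
    """The mix-in goes first so its load_module() runs before the real one."""


loader = LazyToyLoader("VALUE = 6 * 7")
module = loader.load_module("lazy_demo")
before = type(module).__name__      # nothing has been executed yet
value = module.VALUE                # first attribute access triggers the load
after = type(module).__name__
sys.modules.pop("lazy_demo", None)  # clean up after the demo
print(before, value, after)
```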
... or you could use
the importers package
 http://packages.python.org/importers/
Fin
