Advanced Python
Projects, Session 7
7.1 | class Config: constructs an object to read configuration values (key/value pairs) from a file.
pconfig.csv: a CSV file of config keys/values:
db_uname,george
db_password,password1
data_query,"SELECT this, that FROM mytable WHERE col = 5"
Demonstration program using the Config class -- your class should behave as shown below:
# INSTANTIATE THE INSTANCE: READ DATA VALUES FROM FILE,
# AND STORE THEM IN THE INSTANCE (using an internal dict)
conf = Config('pconfig.csv')
# .get() RETRIEVES A VALUE BASED ON A KEY (check these against CSV above)
# or None or default value if not found
print(conf.get('db_uname')) # george
print(conf.get('db_password')) # password1
print(conf.get('data_query')) # SELECT this, that FROM ...
print(conf.get('badkey')) # missing key returns None
print(conf.get('badkey', default=9)) # 9 (missing key returns default)
# FILENAME AND FILE EXTENSION ARE ALSO STORED IN THE INSTANCE
print(conf.filename) # pconfig.csv
print(conf.format) # csv (consider os.path.splitext())
# PRINTING THE INSTANCE CALLS THE __str__() METHOD
# THE STRING RETURNED FROM THIS METHOD IS PRINTED
print(conf) # this prints the string Config('pconfig.csv')
# (consider self.__class__.__name__)
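The two hints in the comments above (os.path.splitext() and self.__class__.__name__) can be tried in isolation. The following is only a sketch of those two calls: the class name Demo is a hypothetical stand-in, and it deliberately omits the file reading and the .get() method that your Config class needs.

```python
import os


class Demo:
    """Hypothetical stand-in class, used only to illustrate the two hints."""

    def __init__(self, filename):
        self.filename = filename
        # splitext() returns (root, extension); the extension keeps its dot
        root, ext = os.path.splitext(filename)
        self.format = ext.lstrip('.')

    def __str__(self):
        # __class__.__name__ follows the actual class, so a subclass
        # reports its own name without overriding __str__()
        return f"{self.__class__.__name__}({self.filename!r})"


demo = Demo('pconfig.csv')
print(demo.format)    # csv
print(demo)           # Demo('pconfig.csv')
```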
Docstrings for Class and Methods. Place a "floating" string just inside the class statement and inside each def statement as shown below. These strings can be picked up by a doc reader.
class Config:
    """ class for producing config instances ... [PLEASE DESCRIBE FURTHER] """
    def __init__(self, filename):
        """ constructor ... [PLEASE DESCRIBE FURTHER] """
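To see how such strings are "picked up", you can read them back yourself. This is a small sketch using a hypothetical class named Sample (not part of the assignment); it relies only on the .__doc__ attribute and on inspect.getdoc(), which returns the same text with indentation cleaned up.

```python
import inspect


class Sample:
    """Hypothetical class used only to show where docstrings end up."""

    def greet(self):
        """Return a fixed greeting."""
        return 'hello'


print(Sample.__doc__)                 # the class docstring
print(inspect.getdoc(Sample.greet))   # the method docstring
```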
conf = Config('corrupted.csv') # ValueError: "corrupted.csv" does not appear to be
# in proper CSV format (2 fields per line)
How to detect a corrupted file? The data file should have two values per line. If your program sees that a line does not have exactly two values, it should raise a ValueError exception with a custom message (DO NOT print the error message! DO NOT exit from the method!). corrupted.csv has been provided as a file that is not in proper format.
PLEASE NOTE you must NEVER print() an error message from a function or method. Use raise and allow the exception to occur. print() statements can easily be missed in the user's output.
PLEASE NOTE you must NEVER exit() from a library function or method. It should not be the responsibility of a utility class to exit a program that imports or uses it.
For added challenge, tell the user on which line too few or too many values were found:
conf = Config('corrupted.csv') # ValueError: error detected in
# "corrupted.csv", line 2.
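One possible way to approach the field-count check is sketched below. It is not the required design: the function name read_pairs and the parameter source_name are hypothetical, and the sketch reads from in-memory text (io.StringIO) instead of a real file so that it runs on its own. It uses the csv module so that a quoted value containing a comma still counts as one field.

```python
import csv
import io


def read_pairs(lines, source_name):
    """Return a dict built from 2-field CSV rows; raise ValueError otherwise.

    read_pairs and source_name are hypothetical names for this sketch.
    """
    params = {}
    reader = csv.reader(lines)
    for row in reader:
        if len(row) != 2:
            # reader.line_num is the number of source lines read so far
            raise ValueError(
                f'error detected in "{source_name}", line {reader.line_num}: '
                f'expected 2 fields, found {len(row)}'
            )
        key, value = row
        params[key] = value
    return params


good = io.StringIO('db_uname,george\ndata_query,"SELECT this, that FROM mytable"\n')
print(read_pairs(good, 'good.csv'))

bad = io.StringIO('db_uname,george\ndb_password\n')
try:
    read_pairs(bad, 'bad.csv')
except ValueError as exc:
    # the caller decides what to do with the error; the function only raises
    print(exc)
```

Note that the function itself neither prints nor exits; only the calling code at the bottom chooses to print the message.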
Remember, NEVER print() OR exit() from a function or method in a module or class. To assist in organizing the program, I recommend the following methods:
7.2 | class ConfigItems: copy class Config and add __getitem__(). This method should do what .get() does, but without the default= argument functionality. __getitem__() will allow the user to access the config file as if it were a dict.
conf = ConfigItems('pconfig.csv')
# invokes method __getitem__() instead of get()
print(conf['db_uname']) # george
print(conf['db_password']) # password1
print(conf['data_query']) # SELECT this, that FROM mytable WHERE col = 5
# this method still works
print(conf.get('db_uname')) # george
print(conf['badkey']) # raises KeyError (use raise, DO NOT PRINT!)
In order to emulate the behavior of a dict, your __getitem__() must raise a KeyError exception if the key can't be found. In other words, __getitem__() will behave like get(), but without the default= argument functionality that get() employs.
It's worth reiterating that functions and methods of classes and modules must never print error messages! They must not call exit(). Instead, they use the 'raise' statement to raise exceptions to signal errors.
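The difference between the two access styles can be seen in a small sketch. The class name Lookup is hypothetical, and it is handed a ready-made dict rather than reading a file; it assumes, as in 7.1, that the data lives in an internal dict called self.params.

```python
class Lookup:
    """Hypothetical stand-in for a class that keeps its data in self.params."""

    def __init__(self, params):
        self.params = dict(params)

    def get(self, key, default=None):
        # a missing key quietly returns the default
        return self.params.get(key, default)

    def __getitem__(self, key):
        # a missing key is an error, signalled with raise (never print)
        if key not in self.params:
            raise KeyError(key)
        return self.params[key]


lk = Lookup({'db_uname': 'george'})
print(lk['db_uname'])       # george
print(lk.get('badkey'))     # None
```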
7.3 | (Extra Credit). class ConfigDict inheriting from dict: this class extends the Config class so that it inherits from the dict class and attempts to replicate all dict behavior in addition to its work in getting values from the file.
If you inherit from dict, you can use the instance itself as the dict rather than using self.params. Reading keys and values from the instance: inheriting from dict means you inherit several dict "read" methods, which do not need to be implemented: reading a value based on a key (__getitem__()), looping through the dict's keys, the methods .keys() and .values(), etc.
Warning: inheriting from dict means you will operate on self. However, inheriting also presents a particular problem -- the problem of recursion, in which a method calls itself:
def __getitem__(self, key):
    value = self[key] # calls self.__getitem__(key)
The above method uses subscripting (value = self[key]), which again calls __getitem__(self, key), which again calls __getitem__(self, key)...
If you do this, you'll see a new type of error:
RecursionError: maximum recursion depth exceeded
This means that Python stopped once the nested calls to __getitem__() exceeded its recursion limit (1000 by default; see sys.getrecursionlimit()).
The way to avoid this problem is by using the parent class to perform the method:
def __getitem__(self, key):
    return dict.__getitem__(self, key)
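As a runnable illustration of delegating to the parent class, here is a sketch of a dict subclass that does some extra work in __getitem__() and then hands the actual lookup to dict. The class name CountingDict and its lookups counter are hypothetical and exist only to give the override something to do.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts subscript lookups."""

    lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        # dict.__getitem__() performs the real lookup, so this method
        # is not re-entered; super().__getitem__(key) would also work
        return dict.__getitem__(self, key)


d = CountingDict(a=1)
print(d['a'])        # 1
print(d.lookups)     # 1
```

Note the return statement: without it the method would perform the lookup and then return None.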
__init__(self, *args, **kwargs): Because it is possible to initialize a dict with arguments (usually a list of key/value pairs), to fully implement a dict-like object you would need to create an __init__() method that a) receives "any" arguments (similar to what was done in the decorator assignment) and b) passes those arguments to super().__init__(). Contact me if you need clarification on this feature.
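The pass-through itself is short. The sketch below uses a hypothetical class name, PassThroughDict, and shows only the forwarding of arguments; how your ConfigDict combines this with a filename argument is left to you.

```python
class PassThroughDict(dict):
    """Hypothetical dict subclass that accepts everything dict() accepts."""

    def __init__(self, *args, **kwargs):
        # forward positional and keyword arguments unchanged to dict.__init__()
        super().__init__(*args, **kwargs)


print(PassThroughDict([('a', 1), ('b', 2)]))   # {'a': 1, 'b': 2}
print(PassThroughDict(a=1, b=2))               # {'a': 1, 'b': 2}
print(PassThroughDict())                       # {}
```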
7.4 | (Extra Credit). Allow class Config to parse .json and .ini files.
Allow the configuration file to be in any of 3 formats: .csv, .json or .ini.
The program uses the extension (stored in the .format attribute) to determine how to parse the data. (Sample config files pconfig.csv, pconfig.json and pconfig.ini are provided.)
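A sketch of the extension check, assuming the three formats named in this assignment. The names SUPPORTED_FORMATS and detect_format are hypothetical; in your class this logic would more likely live in __init__() or a helper method.

```python
import os

# the three extensions named in this assignment
SUPPORTED_FORMATS = ('csv', 'json', 'ini')


def detect_format(filename):
    """Return the extension without its dot, or raise ValueError."""
    ext = os.path.splitext(filename)[1].lstrip('.')
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f'file type "{ext}" not recognized')
    return ext


print(detect_format('pconfig.ini'))    # ini
```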
.ini files use a simple key=value format. Note the equals sign may appear in the data -- use .split('=', maxsplit=1) to split only once.
data_query=SELECT this, that FROM mytable WHERE col = 5
db_uname=george
db_password=password1
email_from=joe@wilson.com
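The effect of maxsplit=1 is easiest to see on the data_query line, whose value contains its own equals sign. The helper name parse_ini_line below is hypothetical; it is a sketch of the split and of the missing-delimiter check, not a complete .ini reader.

```python
def parse_ini_line(line):
    """Split one key=value line at the first '=' only (hypothetical helper)."""
    parts = line.rstrip('\n').split('=', maxsplit=1)
    if len(parts) != 2:
        # no '=' at all: the line cannot be a key/value pair
        raise ValueError(f'missing "=" delimiter in line: {line!r}')
    key, value = parts
    return key, value


print(parse_ini_line('data_query=SELECT this, that FROM mytable WHERE col = 5\n'))
# ('data_query', 'SELECT this, that FROM mytable WHERE col = 5')
```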
Exception to Raise - if the user passes a file with an extension the program does not recognize, it should raise a ValueError:
#obj = Config('pconfig.xxx') # ValueError: file type "xxx" not recognized
If the extension of the file is not csv, json or ini, your code should raise a ValueError exception with the above message, naming the extension and explaining that the file type is not recognized.
Detecting corruption - I have provided corrupted.ini and corrupted.json, each of which cannot be parsed properly. The .ini file has the same issue as the corrupted .csv file: a missing delimiter (here, '='). So the corruption can be detected in a way similar to that of the .csv file.
The corruption in the .json file, however, must be detected in a different way: if the JSON parser used in json.load() detects a missing structural character (for example, a missing quotation mark, colon or brace) it will raise a json.decoder.JSONDecodeError exception. (The exception name is written differently because it is defined in the json module.) First, try to read the corrupted.json file and note that the exception is raised. Next, wrap the line that triggered the exception in a try: block, and follow it with an except block that names the above exception (never use except: or except Exception: because these trap nearly every exception, including ones you did not anticipate). Finally, raise your own ValueError exception with a message like the one used for csv.
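The try/except steps described above might look like the following sketch. The function name load_json_config and the parameter source_name are hypothetical, the message wording is only a suggestion, and the sketch parses in-memory text (io.StringIO) so that it runs without the provided files.

```python
import io
import json


def load_json_config(fh, source_name):
    """Parse JSON from an open file; re-raise parse errors as ValueError.

    load_json_config and source_name are hypothetical names.
    """
    try:
        return json.load(fh)
    except json.decoder.JSONDecodeError as exc:
        # name only the exception we expect; "from exc" keeps the
        # original error attached for anyone debugging
        raise ValueError(
            f'"{source_name}" does not appear to be in proper JSON format '
            f'(line {exc.lineno})'
        ) from exc


print(load_json_config(io.StringIO('{"db_uname": "george"}'), 'ok.json'))

try:
    load_json_config(io.StringIO('{"db_uname" "george"}'), 'bad.json')
except ValueError as exc:
    print(exc)
```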
7.5 | (Extra credit.) Write a simple doc reader. The doc reader takes any class, loops through its attributes and, if the attribute is callable, prints its name and its docstring.
This is done principally through the inspect module (inspect.getmembers(), inspect.signature()), the built-in callable() function and the class or method's .__doc__ attribute. See slides or discussion for details.
When reading the Config class, the program will output several inherited magic attributes and descriptions...
__class__
    type(object_or_name, bases, dict)
    type(object) -> the object's type
    type(name, bases, dict) -> a new type
__delattr__(self, name, /)
    Implement delattr(self, name).
... continues ...
...eventually printing names and docs for methods defined in the Config class. (The docs below are the ones I wrote; yours should be your own.)
...
getvalue
    given a key, return the value found in the object
read_csv_file
    open the config file for reading and read the file contents into a dict in the object
... continues ...
Finally, use inspect.signature() to also find the argument signature for each method:
...
getvalue(self, key)
    given a key, return the value found in the object
read_csv_file(self)
    open the config file for reading and read the file contents into a dict in the object
... continues ...
Note that one callable attribute, __class__, will fail with ValueError if you attempt to call inspect.signature() on it. Trap this exception and skip the signature for this attribute.
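Putting those pieces together, a doc reader could be sketched as below. The names Sample and describe are hypothetical, and the output layout is only one possibility. The sketch traps TypeError as well as ValueError around inspect.signature() as a precaution, since which built-in attributes lack a signature can vary between Python versions.

```python
import inspect


class Sample:
    """Hypothetical class to be read by the doc reader."""

    def getvalue(self, key):
        """given a key, return the value"""
        return key


def describe(cls):
    """Yield (name, signature_text, doc) for each callable attribute of cls."""
    for name, attr in inspect.getmembers(cls):
        if not callable(attr):
            continue
        try:
            sig = str(inspect.signature(attr))
        except (ValueError, TypeError):
            # some built-in callables expose no signature; skip that part
            sig = ''
        yield name, sig, inspect.getdoc(attr) or ''


for name, sig, doc in describe(Sample):
    print(f'{name}{sig}')
    print(f'    {doc}')
```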
7.6 | (Extra Credit). class AttrType (this assignment has a snag that has stumped many, so beware!): an object that is initialized with a type (int, str, etc. -- any type is possible) and makes sure that attribute values set in the instance are of that type, raising a TypeError if a value of an incorrect type is set. (Use isinstance() to validate type.)
x = AttrType(int)
x.a = 5
x.a = 10 # allows attribute to be overwritten
print(x.a) # 10
x.b = 'hey' # ValueError: "hey" is not <class 'int'>
The __init__ constructor also accepts an optional argument writeonce=True, to indicate if attributes can be set only once. (Use hasattr() to see if the object already holds the attribute.)
y = AttrType(str, writeonce=True)
y.a = 'hey'
y.a = 'hello' # ValueError: can't set attribute more than once
__init__ will set two attributes: _type (the initialized type) and _writeonce (indicating if attributes can be set more than once). (These attributes start with a single underscore because they are "non-public" attributes -- see variable naming conventions.)
The class implements the __setattr__ method to validate attributes before setting. It uses internal instance attributes _type to validate the type and _writeonce to disallow overwriting (if applicable). The method also makes sure that these internal attributes cannot be overwritten.
However, beware setting attributes on the instance directly! Remember that anytime you set an attribute on self, Python will attempt to call __setattr__, calling the method again, potentially causing infinite recursion (you would see the error RecursionError: maximum recursion depth exceeded). Instead, call the parent class' __setattr__ method: object.__setattr__(self, attrname, attrval), where attrname and attrval are the attribute name and value. We are also seeing an issue where Python seems unable to see an attribute that was set on the object.
The class also implements two methods: as_list() returns the attribute values as a list, and as_dict() returns the attribute dict (however, make sure the return values don't contain the _type or _writeonce attributes).
z = AttrType(float)
z.a = 5.5
z.b = 10.0
print(z.as_list()) # [5.5, 10.0]
print(z.as_dict()) # {'a': 5.5, 'b': 10.0}
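The __setattr__ mechanics can be sketched as follows. This is one possible arrangement, not a reference solution: the class name TypedAttrs is hypothetical, writeonce is assumed to default to False (the first demonstration allows overwriting), and the sketch raises TypeError for a wrong type as the description says, whereas the sample output above shows ValueError, so use whichever your version is expected to raise. Raising AttributeError for the protected internals is also an assumption.

```python
class TypedAttrs:
    """Hypothetical sketch: only accepts attribute values of one type."""

    _INTERNAL = ('_type', '_writeonce')

    def __init__(self, type_, writeonce=False):
        # go through object.__setattr__ so our own checks are bypassed
        # while the internal attributes are being stored
        object.__setattr__(self, '_type', type_)
        object.__setattr__(self, '_writeonce', writeonce)

    def __setattr__(self, name, value):
        if name in self._INTERNAL:
            raise AttributeError(f'{name} cannot be overwritten')
        if not isinstance(value, self._type):
            raise TypeError(f'"{value}" is not {self._type}')
        if self._writeonce and hasattr(self, name):
            raise ValueError("can't set attribute more than once")
        # never "self.name = value" here: that would call __setattr__ again
        object.__setattr__(self, name, value)

    def as_dict(self):
        # vars() returns the instance dict; leave out the internals
        return {k: v for k, v in vars(self).items() if k not in self._INTERNAL}

    def as_list(self):
        return list(self.as_dict().values())


t = TypedAttrs(float)
t.a = 5.5
t.b = 10.0
print(t.as_list())    # [5.5, 10.0]
print(t.as_dict())    # {'a': 5.5, 'b': 10.0}
```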
Extra credit: allow the type argument 'number' (passed as a string, not an object type). In order to validate, use isinstance() on the parent type: isinstance(var, numbers.Number) (imported from the numbers module). This will allow the user to assign any number (including types int, float and complex).
z = AttrType('number')
z.a = 5.5
z.b = 10
z.c = 'hello' # ValueError: "hello" is not a number
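The numbers.Number check can be tried on its own before wiring it into the class. The helper name is_number below is hypothetical; Fraction is included only to show that the check is not limited to int and float.

```python
import numbers
from fractions import Fraction


def is_number(value):
    """Return True if value is any kind of number (hypothetical helper)."""
    return isinstance(value, numbers.Number)


for value in (5.5, 10, Fraction(1, 3), 'hello'):
    print(repr(value), is_number(value))
```

The three numeric values report True and the string reports False. Be aware that bool values also pass this check, because bool is a subclass of int.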