Reading Data from a File in Python

https://nostarch.com/pythoncrashcourse2e
https://ehmatthes.github.io/pcc_2e/
https://ehmatthes.github.io/pcc/

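1 Reading from a File

To use the information in a text file, you first need to read it into memory. You can read the entire contents of a file at once, or you can work through the file one line at a time.

1.1 Reading an Entire File

To begin, we need a file with a few lines of text in it. The file below contains pi to 30 decimal places, with 10 decimal places per line: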
pi_digits.txt

3.1415926535
  8979323846
  2643383279

file_reader.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        contents = file_object.read()
        print(contents)

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535
  8979323846
  2643383279


Process finished with exit code 0

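To do anything with a file, you first have to open it so that you can access it. Here the open() function takes one argument: the name of the file to open. Because 'pi_digits.txt' is a bare filename, Python looks for it in the current working directory, which is the directory of the running program when you launch the program from there.
The open() function returns an object representing the file. Here, open('pi_digits.txt') returns an object representing pi_digits.txt; the as clause stores this object in the variable file_object, which we use later in the program.

The keyword with closes the file once access to it is no longer needed. Notice that this program calls open() but never calls close(). You could open and close the file by calling open() and close() yourself, but if a bug in the program prevents the close() statement from executing, the file never closes. This may seem trivial, but improperly closed files can cause data to be lost or corrupted. And if you call close() too early, you will find yourself trying to work with a closed (inaccessible) file, which leads to more errors. It is not always easy to know exactly when to close a file, but with the structure shown above, Python works it out for you: you open the file and use it as needed, and Python closes it automatically when the with block finishes.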
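As a minimal sketch of the difference described above (written for Python 3; the temporary directory and the variable names manual_contents, with_contents, manual_closed and with_closed are assumptions made for this illustration), the following compares closing a file by hand with letting with do it:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'pi_digits.txt')
with open(filename, 'w') as file_object:
    file_object.write('3.1415926535\n')

# Manual version: close() has to sit in a finally clause,
# otherwise an exception raised by read() would skip it.
file_object = open(filename)
try:
    manual_contents = file_object.read()
finally:
    file_object.close()
manual_closed = file_object.closed

# with version: the file is closed as soon as the block exits,
# whether it exits normally or through an exception.
with open(filename) as file_object:
    with_contents = file_object.read()
with_closed = file_object.closed

print(manual_closed, with_closed)
```

Both versions read the same text and leave the file closed; the with form simply needs less code to get there.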

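Once we have a file object representing pi_digits.txt, the read() method reads the entire contents of the file and stores it as one long string in the variable contents. Printing the value of contents then displays the whole text file.

The only difference between this output and the original file is the extra blank line at the end. It appears because the text returned by read() ends with the file's final newline character, and print() then adds a newline of its own. To remove the extra blank line, call rstrip() inside the print() call: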

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        contents = file_object.read()
        print(contents.rstrip())

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535
  8979323846
  2643383279

Process finished with exit code 0

Python's rstrip() method removes (strips) any whitespace characters, including newlines, from the right end of a string.
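A small sketch of what rstrip() does to the text read above. The string literal stands in for the result of read() and is an assumption for illustration, so the snippet runs without the file:

```python
# Stand-in for the string that read() would return for pi_digits.txt.
contents = '3.1415926535\n  8979323846\n  2643383279\n'

stripped = contents.rstrip()

# repr() makes the otherwise invisible newline characters visible.
print(repr(contents))
print(repr(stripped))
```

Only the trailing newline is removed; the newlines between lines and the leading spaces on the second and third lines are left alone.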

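1.2 File Paths

When you pass a bare filename to open(), Python looks in the current working directory; when you run a program from its own folder, that is the directory where the .py program file is stored.
You might store your program files in a folder called python_work, and inside python_work keep a folder called text_files for the text files your programs work with. Even though text_files is inside python_work, passing open() just the name of a file in text_files will not work, because Python only looks in python_work and does not search its subfolder text_files. To open files that are not in the same directory as your program file, you need to provide a file path, which tells Python to look in a specific location on your system.
Because text_files is inside python_work, you can use a relative file path to open a file from text_files. A relative file path tells Python to look for a location relative to the directory the program is running from. On Linux and OS X: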

with open('text_files/filename.txt') as file_object:

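This line tells Python to look in the text_files folder inside python_work for the specified .txt file. On Windows, file paths use a backslash (\) rather than a forward slash (/). Because a backslash starts an escape sequence in an ordinary Python string literal (\f would be read as a form feed), the path is written as a raw string:

with open(r'text_files\filename.txt') as file_object: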

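You can also tell Python exactly where the file is on your computer, regardless of where the running program is stored. This is called an absolute file path. Use an absolute path when a relative path does not work. For example, if text_files is not in python_work but in a folder called other_files, passing open() the path 'text_files/filename.txt' will not work, because Python only looks for that location inside python_work. To state clearly where Python should look, you need to provide the full path.
Absolute paths are usually longer than relative paths, so it helps to store them in a variable and pass that variable to open(). On Linux and OS X, absolute paths look like this: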

file_path = '/home/ehmatthes/other_files/text_files/filename.txt'
with open(file_path) as file_object:

On Windows, they look like this (a raw string keeps every backslash literal):

file_path = r'C:\Users\ehmatthes\other_files\text_files\filename.txt'
with open(file_path) as file_object:

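With absolute paths, you can read files from any location on your system. For now, the simplest approach is to store data files either in the same directory as your program files or in a folder inside that directory.
Note: Windows can sometimes interpret forward slashes in file paths correctly. If you are using Windows and the results are not what you expect, make sure the file path uses backslashes, written so that Python does not treat them as escape sequences (for example, in a raw string).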
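The path rules above can be sketched with the standard os.path module. The folder and file names are the hypothetical ones used in the text, and nothing is actually opened, so the snippet runs even if those files do not exist:

```python
import os

# Hypothetical folder and file names, reused from the text for illustration.
# os.path.join() inserts the separator appropriate for the current system.
relative_path = os.path.join('text_files', 'filename.txt')

# A relative path is resolved against the current working directory.
absolute_path = os.path.abspath(relative_path)
print(os.path.isabs(relative_path), os.path.isabs(absolute_path))

# In an ordinary literal, '\f' is a single form-feed character;
# a raw string keeps the backslash and the letter f as two characters.
plain_literal = 'text_files\filename.txt'
raw_literal = r'text_files\filename.txt'
print(len(plain_literal), len(raw_literal))
```

The difference in length between the two literals is the silent corruption that an unescaped backslash can introduce into a Windows path.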

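1.3 Reading Line by Line

When you read a file, you often want to examine each line: you might be looking for certain information in the file, or you might want to modify the text in some way.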

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        for line in file_object:
            print(line)

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535

  8979323846

  2643383279


Process finished with exit code 0

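We store the name of the file we are reading in the variable filename, a common convention when working with files. Because filename does not represent the actual file (it is only a string telling Python where to find it), you can easily swap 'pi_digits.txt' for the name of another file you want to work with. After calling open(), an object representing the file and its contents is stored in the variable file_object. We again use the keyword with so that Python opens and closes the file properly. To examine the file's contents, we work through each line by looping over the file object.
When we print each line, we find even more blank lines. Each line in the file ends with an invisible newline character, and print() adds its own newline, so every line ends with two newlines: one from the file and one from print(). To eliminate these extra blank lines, call rstrip() inside the print() call: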

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        for line in file_object:
            print(line.rstrip())

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535
  8979323846
  2643383279

Process finished with exit code 0
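An alternative to rstrip(), sketched here for Python 3, is to keep each line as it is and tell print() not to append a newline of its own. io.StringIO stands in for the opened file, and the output buffer is an assumption made so the result can be inspected:

```python
import io

# io.StringIO stands in for the file object returned by open().
file_object = io.StringIO('3.1415926535\n  8979323846\n  2643383279\n')
output = io.StringIO()

for line in file_object:
    # Each line already ends with '\n', so suppress print()'s own newline.
    print(line, end='', file=output)

# Show the collected text; it contains no doubled newlines.
print(output.getvalue(), end='')
```

Unlike rstrip(), this leaves any other trailing whitespace on each line untouched, which matters when that whitespace is significant.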

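1.4 Making a List of Lines from a File

When you use with, the file object returned by open() is only usable inside the with block, because the file is closed when the block ends. If you want to access the file's contents outside the with block, store the file's lines in a list inside the block and then work with that list afterwards: you can process parts of the file immediately or postpone the processing until later in the program.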

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        lines = file_object.readlines()

    for line in lines:
        print(line.rstrip())

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535
  8979323846
  2643383279

Process finished with exit code 0

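The readlines() method reads each line from the file and stores the lines in a list. That list is then stored in the variable lines, which we can keep using after the with block ends.
Because each item in lines corresponds to one line of the file, the output matches the file's contents exactly.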
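To make the behavior of readlines() concrete, here is a minimal Python 3 sketch. io.StringIO again stands in for the opened file, and the names is_closed, read_failed and clean_lines are assumptions for this illustration:

```python
import io

# io.StringIO stands in for the file object returned by open().
file_object = io.StringIO('3.1415926535\n  8979323846\n  2643383279\n')

with file_object:
    lines = file_object.readlines()

# After the with block the file object is closed...
is_closed = file_object.closed
try:
    file_object.read()
    read_failed = False
except ValueError:
    # Reading from a closed file raises ValueError.
    read_failed = True

# ...but the list built inside the block is still available.
# Each element keeps its trailing newline, which rstrip('\n') removes.
clean_lines = [line.rstrip('\n') for line in lines]
print(lines)
print(clean_lines)
```

Each element of lines still carries its newline, which is why the listing above calls rstrip() before printing.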

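1.5 Working with a File's Contents

After you have read a file into memory, you can do whatever you want with the data. We will build a single string that contains all the digits stored in the file, with no whitespace in it; the first attempt uses rstrip():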

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        lines = file_object.readlines()

    pi_string = ''
    for line in lines:
        pi_string += line.rstrip()

    print(pi_string)
    print(len(pi_string))

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.1415926535  8979323846  2643383279
36

Process finished with exit code 0

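rstrip() removes the newline character at the end of each line, but the leading spaces on the second and third lines remain, which is why the string is 36 characters long.

To remove the whitespace on both sides of each line, use strip() instead of rstrip():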

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        lines = file_object.readlines()

    pi_string = ''
    for line in lines:
        pi_string += line.strip()

    print(pi_string)
    print(len(pi_string))

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.141592653589793238462643383279
32

Process finished with exit code 0

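Note: when Python reads from a text file, it interprets all text in the file as a string. If you read in a number and want to work with it as a numeric value, you must convert it to an integer with the int() function or to a floating-point number with the float() function.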
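A short sketch of the conversions mentioned in the note, using the string built above. The use of decimal.Decimal is an addition of this example, not something the original text covers; it is shown because float() can only keep roughly 15 to 17 significant digits:

```python
from decimal import Decimal

# The 32-character string produced by the strip() listing above.
pi_string = '3.141592653589793238462643383279'

# float() gives a value usable in arithmetic, at double precision.
pi_value = float(pi_string)

# int() converts a string of digits, here the single character before the point.
first_digit = int(pi_string[0])

# Decimal keeps every digit that was present in the string.
pi_exact = Decimal(pi_string)

print(pi_value * 2)
print(first_digit + 1)
print(pi_exact)
```

Which conversion is appropriate depends on whether the extra digits matter for the calculation at hand.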

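1.6 Large Files: One Million Digits

We print only the first 50 decimal places, so that the terminal does not have to scroll through all 1 000 000 digits. Note that the listing below still opens the small pi_digits.txt, so the run shown prints all 30 of its decimal places and a length of 32; pointed at a one-million-digit file such as pi_million_digits.txt (used in section 1.7), the same slice would stop after 50 decimal places: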

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_digits.txt'

    with open(filename) as file_object:
        lines = file_object.readlines()

    pi_string = ''
    for line in lines:
        pi_string += line.strip()

    print(pi_string[:52] + "...")
    print(len(pi_string))

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
3.141592653589793238462643383279...
32

Process finished with exit code 0

Python has no inherent limit on how much data you can work with; you can work with as much data as your system's memory can handle.
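Since memory is the practical limit, it can help to avoid holding both a list of lines and the joined string at the same time. The following Python 3 sketch is one possible variation on the listing above, not the author's method: it iterates over the file object directly and joins the stripped lines in one step. io.StringIO stands in for a large file:

```python
import io

# io.StringIO stands in for a large file such as pi_million_digits.txt.
file_object = io.StringIO('3.1415926535\n  8979323846\n  2643383279\n')

with file_object:
    # Iterating over the file yields one line at a time, so no
    # intermediate list of all lines is built by readlines().
    pi_string = ''.join(line.strip() for line in file_object)

print(pi_string[:12] + '...')
print(len(pi_string))
```

With a real file, replacing the io.StringIO object with open(filename) would leave the rest of the code unchanged.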

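1.7 Is Your Birthday Contained in Pi?

Let's find out whether someone's birthday appears anywhere in the first 1 000 000 digits of pi. We can express the birthday as a string of digits and then check whether that string appears in pi_string: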

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Yongqiang Cheng

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/..')
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)

if __name__ == '__main__':
    filename = 'pi_million_digits.txt'

    with open(filename) as file_object:
        lines = file_object.readlines()

    pi_string = ''
    for line in lines:
        pi_string += line.strip()

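    # Under Python 2 (used for the run below), input() evaluates what is typed,
    # so the digits were entered in quotes; raw_input() would return the text as-is.
    # Under Python 3, input() always returns a string.
    birthday = input("Enter your birthday, in the form mmddyy: ")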
    if birthday in pi_string:
        print("Your birthday appears in the first million digits of pi!")
    else:
        print("Your birthday does not appear in the first million digits of pi.")

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py --gpu=0
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
Enter your birthday, in the form mmddyy: "19880116"
Your birthday does not appear in the first million digits of pi.

Process finished with exit code 0
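The in operator only answers yes or no. As a hedged extension (the helper name find_in_pi is hypothetical and the short pi_string is a stand-in for the million-digit string), str.find() can also report where the digits first occur:

```python
# Stand-in for the million-digit string built in the listing above.
pi_string = '3.141592653589793238462643383279'


def find_in_pi(digits, pi_string):
    """Return the 1-based decimal place where digits first starts, or None."""
    decimals = pi_string[2:]        # drop the leading '3.'
    index = decimals.find(digits)   # find() returns -1 when there is no match
    if index == -1:
        return None
    return index + 1


print(find_in_pi('1592', pi_string))
print(find_in_pi('8979', pi_string))
print(find_in_pi('000000', pi_string))
```

Searching only the decimals keeps the leading 3 from producing a match that is not really a decimal place.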

Article link: https://blog.csdn.net/chengyq116/article/details/102990068
