From: Linus Torvalds Date: Fri, 10 May 2013 16:08:21 +0000 (-0700) Subject: Merge tag 'kvm-3.10-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm X-Git-Tag: firefly_0821_release~3680^2~493 X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=c67723ebbb2d6f672a0e9e5b1a8d1a2442942557;hp=326f578f7e1443bac2333712dd130a261ec15288;p=firefly-linux-kernel-4.4.55.git Merge tag 'kvm-3.10-2' of git://git./virt/kvm/kvm Pull kvm fixes from Gleb Natapov: "Most of the fixes are in the emulator since now we emulate more than we did before for correctness sake we see more bugs there, but there is also an OOPS fixed and corruption of xcr0 register." * tag 'kvm-3.10-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: emulator: emulate SALC KVM: emulator: emulate XLAT KVM: emulator: emulate AAM KVM: VMX: fix halt emulation while emulating invalid guest sate KVM: Fix kvm_irqfd_init initialization KVM: x86: fix maintenance of guest/host xcr0 state --- diff --git a/Documentation/ABI/testing/sysfs-block-bcache b/Documentation/ABI/testing/sysfs-block-bcache new file mode 100644 index 000000000000..9e4bbc5d51fd --- /dev/null +++ b/Documentation/ABI/testing/sysfs-block-bcache @@ -0,0 +1,156 @@ +What: /sys/block//bcache/unregister +Date: November 2010 +Contact: Kent Overstreet +Description: + A write to this file causes the backing device or cache to be + unregistered. If a backing device had dirty data in the cache, + writeback mode is automatically disabled and all dirty data is + flushed before the device is unregistered. Caches unregister + all associated backing devices before unregistering themselves. + +What: /sys/block//bcache/clear_stats +Date: November 2010 +Contact: Kent Overstreet +Description: + Writing to this file resets all the statistics for the device. + +What: /sys/block//bcache/cache +Date: November 2010 +Contact: Kent Overstreet +Description: + For a backing device that has cache, a symlink to + the bcache/ dir of that cache. + +What: /sys/block//bcache/cache_hits +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: integer number of full cache hits, + counted per bio. A partial cache hit counts as a miss. + +What: /sys/block//bcache/cache_misses +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: integer number of cache misses. + +What: /sys/block//bcache/cache_hit_ratio +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: cache hits as a percentage. + +What: /sys/block//bcache/sequential_cutoff +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: Threshold past which sequential IO will + skip the cache. Read and written as bytes in human readable + units (i.e. echo 10M > sequntial_cutoff). + +What: /sys/block//bcache/bypassed +Date: November 2010 +Contact: Kent Overstreet +Description: + Sum of all reads and writes that have bypassed the cache (due + to the sequential cutoff). Expressed as bytes in human + readable units. + +What: /sys/block//bcache/writeback +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: When on, writeback caching is enabled and + writes will be buffered in the cache. When off, caching is in + writethrough mode; reads and writes will be added to the + cache but no write buffering will take place. + +What: /sys/block//bcache/writeback_running +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: when off, dirty data will not be written + from the cache to the backing device. The cache will still be + used to buffer writes until it is mostly full, at which point + writes transparently revert to writethrough mode. Intended only + for benchmarking/testing. + +What: /sys/block//bcache/writeback_delay +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: In writeback mode, when dirty data is + written to the cache and the cache held no dirty data for that + backing device, writeback from cache to backing device starts + after this delay, expressed as an integer number of seconds. + +What: /sys/block//bcache/writeback_percent +Date: November 2010 +Contact: Kent Overstreet +Description: + For backing devices: If nonzero, writeback from cache to + backing device only takes place when more than this percentage + of the cache is used, allowing more write coalescing to take + place and reducing total number of writes sent to the backing + device. Integer between 0 and 40. + +What: /sys/block//bcache/synchronous +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, a boolean that allows synchronous mode to be + switched on and off. In synchronous mode all writes are ordered + such that the cache can reliably recover from unclean shutdown; + if disabled bcache will not generally wait for writes to + complete but if the cache is not shut down cleanly all data + will be discarded from the cache. Should not be turned off with + writeback caching enabled. + +What: /sys/block//bcache/discard +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, a boolean allowing discard/TRIM to be turned off + or back on if the device supports it. + +What: /sys/block//bcache/bucket_size +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, bucket size in human readable units, as set at + cache creation time; should match the erase block size of the + SSD for optimal performance. + +What: /sys/block//bcache/nbuckets +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, the number of usable buckets. + +What: /sys/block//bcache/tree_depth +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, height of the btree excluding leaf nodes (i.e. a + one node tree will have a depth of 0). + +What: /sys/block//bcache/btree_cache_size +Date: November 2010 +Contact: Kent Overstreet +Description: + Number of btree buckets/nodes that are currently cached in + memory; cache dynamically grows and shrinks in response to + memory pressure from the rest of the system. + +What: /sys/block//bcache/written +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, total amount of data in human readable units + written to the cache, excluding all metadata. + +What: /sys/block//bcache/btree_written +Date: November 2010 +Contact: Kent Overstreet +Description: + For a cache, sum of all btree writes in human readable units. diff --git a/Documentation/ABI/testing/sysfs-class-mtd b/Documentation/ABI/testing/sysfs-class-mtd index 938ef71e2035..3105644b3bfc 100644 --- a/Documentation/ABI/testing/sysfs-class-mtd +++ b/Documentation/ABI/testing/sysfs-class-mtd @@ -14,8 +14,7 @@ Description: The /sys/class/mtd/mtd{0,1,2,3,...} directories correspond to each /dev/mtdX character device. These may represent physical/simulated flash devices, partitions on a flash - device, or concatenated flash devices. They exist regardless - of whether CONFIG_MTD_CHAR is actually enabled. + device, or concatenated flash devices. What: /sys/class/mtd/mtdXro/ Date: April 2009 @@ -23,8 +22,7 @@ KernelVersion: 2.6.29 Contact: linux-mtd@lists.infradead.org Description: These directories provide the corresponding read-only device - nodes for /sys/class/mtd/mtdX/ . They are only created - (for the benefit of udev) if CONFIG_MTD_CHAR is enabled. + nodes for /sys/class/mtd/mtdX/ . What: /sys/class/mtd/mtdX/dev Date: April 2009 diff --git a/Documentation/acpi/enumeration.txt b/Documentation/acpi/enumeration.txt index b0d541042ac6..d9be7a97dff3 100644 --- a/Documentation/acpi/enumeration.txt +++ b/Documentation/acpi/enumeration.txt @@ -66,6 +66,83 @@ the ACPI device explicitly to acpi_platform_device_ids list defined in drivers/acpi/acpi_platform.c. This limitation is only for the platform devices, SPI and I2C devices are created automatically as described below. +DMA support +~~~~~~~~~~~ +DMA controllers enumerated via ACPI should be registered in the system to +provide generic access to their resources. For example, a driver that would +like to be accessible to slave devices via generic API call +dma_request_slave_channel() must register itself at the end of the probe +function like this: + + err = devm_acpi_dma_controller_register(dev, xlate_func, dw); + /* Handle the error if it's not a case of !CONFIG_ACPI */ + +and implement custom xlate function if needed (usually acpi_dma_simple_xlate() +is enough) which converts the FixedDMA resource provided by struct +acpi_dma_spec into the corresponding DMA channel. A piece of code for that case +could look like: + + #ifdef CONFIG_ACPI + struct filter_args { + /* Provide necessary information for the filter_func */ + ... + }; + + static bool filter_func(struct dma_chan *chan, void *param) + { + /* Choose the proper channel */ + ... + } + + static struct dma_chan *xlate_func(struct acpi_dma_spec *dma_spec, + struct acpi_dma *adma) + { + dma_cap_mask_t cap; + struct filter_args args; + + /* Prepare arguments for filter_func */ + ... + return dma_request_channel(cap, filter_func, &args); + } + #else + static struct dma_chan *xlate_func(struct acpi_dma_spec *dma_spec, + struct acpi_dma *adma) + { + return NULL; + } + #endif + +dma_request_slave_channel() will call xlate_func() for each registered DMA +controller. In the xlate function the proper channel must be chosen based on +information in struct acpi_dma_spec and the properties of the controller +provided by struct acpi_dma. + +Clients must call dma_request_slave_channel() with the string parameter that +corresponds to a specific FixedDMA resource. By default "tx" means the first +entry of the FixedDMA resource array, "rx" means the second entry. The table +below shows a layout: + + Device (I2C0) + { + ... + Method (_CRS, 0, NotSerialized) + { + Name (DBUF, ResourceTemplate () + { + FixedDMA (0x0018, 0x0004, Width32bit, _Y48) + FixedDMA (0x0019, 0x0005, Width32bit, ) + }) + ... + } + } + +So, the FixedDMA with request line 0x0018 is "tx" and next one is "rx" in +this example. + +In robust cases the client unfortunately needs to call +acpi_dma_request_slave_chan_by_index() directly and therefore choose the +specific FixedDMA resource by its index. + SPI serial bus support ~~~~~~~~~~~~~~~~~~~~~~ Slave devices behind SPI bus have SpiSerialBus resource attached to them. diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt new file mode 100644 index 000000000000..77db8809bd96 --- /dev/null +++ b/Documentation/bcache.txt @@ -0,0 +1,431 @@ +Say you've got a big slow raid 6, and an X-25E or three. Wouldn't it be +nice if you could use them as cache... Hence bcache. + +Wiki and git repositories are at: + http://bcache.evilpiepirate.org + http://evilpiepirate.org/git/linux-bcache.git + http://evilpiepirate.org/git/bcache-tools.git + +It's designed around the performance characteristics of SSDs - it only allocates +in erase block sized buckets, and it uses a hybrid btree/log to track cached +extants (which can be anywhere from a single sector to the bucket size). It's +designed to avoid random writes at all costs; it fills up an erase block +sequentially, then issues a discard before reusing it. + +Both writethrough and writeback caching are supported. Writeback defaults to +off, but can be switched on and off arbitrarily at runtime. Bcache goes to +great lengths to protect your data - it reliably handles unclean shutdown. (It +doesn't even have a notion of a clean shutdown; bcache simply doesn't return +writes as completed until they're on stable storage). + +Writeback caching can use most of the cache for buffering writes - writing +dirty data to the backing device is always done sequentially, scanning from the +start to the end of the index. + +Since random IO is what SSDs excel at, there generally won't be much benefit +to caching large sequential IO. Bcache detects sequential IO and skips it; +it also keeps a rolling average of the IO sizes per task, and as long as the +average is above the cutoff it will skip all IO from that task - instead of +caching the first 512k after every seek. Backups and large file copies should +thus entirely bypass the cache. + +In the event of a data IO error on the flash it will try to recover by reading +from disk or invalidating cache entries. For unrecoverable errors (meta data +or dirty data), caching is automatically disabled; if dirty data was present +in the cache it first disables writeback caching and waits for all dirty data +to be flushed. + +Getting started: +You'll need make-bcache from the bcache-tools repository. Both the cache device +and backing device must be formatted before use. + make-bcache -B /dev/sdb + make-bcache -C /dev/sdc + +make-bcache has the ability to format multiple devices at the same time - if +you format your backing devices and cache device at the same time, you won't +have to manually attach: + make-bcache -B /dev/sda /dev/sdb -C /dev/sdc + +To make bcache devices known to the kernel, echo them to /sys/fs/bcache/register: + + echo /dev/sdb > /sys/fs/bcache/register + echo /dev/sdc > /sys/fs/bcache/register + +To register your bcache devices automatically, you could add something like +this to an init script: + + echo /dev/sd* > /sys/fs/bcache/register_quiet + +It'll look for bcache superblocks and ignore everything that doesn't have one. + +Registering the backing device makes the bcache show up in /dev; you can now +format it and use it as normal. But the first time using a new bcache device, +it'll be running in passthrough mode until you attach it to a cache. See the +section on attaching. + +The devices show up at /dev/bcacheN, and can be controlled via sysfs from +/sys/block/bcacheN/bcache: + + mkfs.ext4 /dev/bcache0 + mount /dev/bcache0 /mnt + +Cache devices are managed as sets; multiple caches per set isn't supported yet +but will allow for mirroring of metadata and dirty data in the future. Your new +cache set shows up as /sys/fs/bcache/ + +ATTACHING: + +After your cache device and backing device are registered, the backing device +must be attached to your cache set to enable caching. Attaching a backing +device to a cache set is done thusly, with the UUID of the cache set in +/sys/fs/bcache: + + echo > /sys/block/bcache0/bcache/attach + +This only has to be done once. The next time you reboot, just reregister all +your bcache devices. If a backing device has data in a cache somewhere, the +/dev/bcache# device won't be created until the cache shows up - particularly +important if you have writeback caching turned on. + +If you're booting up and your cache device is gone and never coming back, you +can force run the backing device: + + echo 1 > /sys/block/sdb/bcache/running + +(You need to use /sys/block/sdb (or whatever your backing device is called), not +/sys/block/bcache0, because bcache0 doesn't exist yet. If you're using a +partition, the bcache directory would be at /sys/block/sdb/sdb2/bcache) + +The backing device will still use that cache set if it shows up in the future, +but all the cached data will be invalidated. If there was dirty data in the +cache, don't expect the filesystem to be recoverable - you will have massive +filesystem corruption, though ext4's fsck does work miracles. + +ERROR HANDLING: + +Bcache tries to transparently handle IO errors to/from the cache device without +affecting normal operation; if it sees too many errors (the threshold is +configurable, and defaults to 0) it shuts down the cache device and switches all +the backing devices to passthrough mode. + + - For reads from the cache, if they error we just retry the read from the + backing device. + + - For writethrough writes, if the write to the cache errors we just switch to + invalidating the data at that lba in the cache (i.e. the same thing we do for + a write that bypasses the cache) + + - For writeback writes, we currently pass that error back up to the + filesystem/userspace. This could be improved - we could retry it as a write + that skips the cache so we don't have to error the write. + + - When we detach, we first try to flush any dirty data (if we were running in + writeback mode). It currently doesn't do anything intelligent if it fails to + read some of the dirty data, though. + +TROUBLESHOOTING PERFORMANCE: + +Bcache has a bunch of config options and tunables. The defaults are intended to +be reasonable for typical desktop and server workloads, but they're not what you +want for getting the best possible numbers when benchmarking. + + - Bad write performance + + If write performance is not what you expected, you probably wanted to be + running in writeback mode, which isn't the default (not due to a lack of + maturity, but simply because in writeback mode you'll lose data if something + happens to your SSD) + + # echo writeback > /sys/block/bcache0/cache_mode + + - Bad performance, or traffic not going to the SSD that you'd expect + + By default, bcache doesn't cache everything. It tries to skip sequential IO - + because you really want to be caching the random IO, and if you copy a 10 + gigabyte file you probably don't want that pushing 10 gigabytes of randomly + accessed data out of your cache. + + But if you want to benchmark reads from cache, and you start out with fio + writing an 8 gigabyte test file - so you want to disable that. + + # echo 0 > /sys/block/bcache0/bcache/sequential_cutoff + + To set it back to the default (4 mb), do + + # echo 4M > /sys/block/bcache0/bcache/sequential_cutoff + + - Traffic's still going to the spindle/still getting cache misses + + In the real world, SSDs don't always keep up with disks - particularly with + slower SSDs, many disks being cached by one SSD, or mostly sequential IO. So + you want to avoid being bottlenecked by the SSD and having it slow everything + down. + + To avoid that bcache tracks latency to the cache device, and gradually + throttles traffic if the latency exceeds a threshold (it does this by + cranking down the sequential bypass). + + You can disable this if you need to by setting the thresholds to 0: + + # echo 0 > /sys/fs/bcache//congested_read_threshold_us + # echo 0 > /sys/fs/bcache//congested_write_threshold_us + + The default is 2000 us (2 milliseconds) for reads, and 20000 for writes. + + - Still getting cache misses, of the same data + + One last issue that sometimes trips people up is actually an old bug, due to + the way cache coherency is handled for cache misses. If a btree node is full, + a cache miss won't be able to insert a key for the new data and the data + won't be written to the cache. + + In practice this isn't an issue because as soon as a write comes along it'll + cause the btree node to be split, and you need almost no write traffic for + this to not show up enough to be noticable (especially since bcache's btree + nodes are huge and index large regions of the device). But when you're + benchmarking, if you're trying to warm the cache by reading a bunch of data + and there's no other traffic - that can be a problem. + + Solution: warm the cache by doing writes, or use the testing branch (there's + a fix for the issue there). + +SYSFS - BACKING DEVICE: + +attach + Echo the UUID of a cache set to this file to enable caching. + +cache_mode + Can be one of either writethrough, writeback, writearound or none. + +clear_stats + Writing to this file resets the running total stats (not the day/hour/5 minute + decaying versions). + +detach + Write to this file to detach from a cache set. If there is dirty data in the + cache, it will be flushed first. + +dirty_data + Amount of dirty data for this backing device in the cache. Continuously + updated unlike the cache set's version, but may be slightly off. + +label + Name of underlying device. + +readahead + Size of readahead that should be performed. Defaults to 0. If set to e.g. + 1M, it will round cache miss reads up to that size, but without overlapping + existing cache entries. + +running + 1 if bcache is running (i.e. whether the /dev/bcache device exists, whether + it's in passthrough mode or caching). + +sequential_cutoff + A sequential IO will bypass the cache once it passes this threshhold; the + most recent 128 IOs are tracked so sequential IO can be detected even when + it isn't all done at once. + +sequential_merge + If non zero, bcache keeps a list of the last 128 requests submitted to compare + against all new requests to determine which new requests are sequential + continuations of previous requests for the purpose of determining sequential + cutoff. This is necessary if the sequential cutoff value is greater than the + maximum acceptable sequential size for any single request. + +state + The backing device can be in one of four different states: + + no cache: Has never been attached to a cache set. + + clean: Part of a cache set, and there is no cached dirty data. + + dirty: Part of a cache set, and there is cached dirty data. + + inconsistent: The backing device was forcibly run by the user when there was + dirty data cached but the cache set was unavailable; whatever data was on the + backing device has likely been corrupted. + +stop + Write to this file to shut down the bcache device and close the backing + device. + +writeback_delay + When dirty data is written to the cache and it previously did not contain + any, waits some number of seconds before initiating writeback. Defaults to + 30. + +writeback_percent + If nonzero, bcache tries to keep around this percentage of the cache dirty by + throttling background writeback and using a PD controller to smoothly adjust + the rate. + +writeback_rate + Rate in sectors per second - if writeback_percent is nonzero, background + writeback is throttled to this rate. Continuously adjusted by bcache but may + also be set by the user. + +writeback_running + If off, writeback of dirty data will not take place at all. Dirty data will + still be added to the cache until it is mostly full; only meant for + benchmarking. Defaults to on. + +SYSFS - BACKING DEVICE STATS: + +There are directories with these numbers for a running total, as well as +versions that decay over the past day, hour and 5 minutes; they're also +aggregated in the cache set directory as well. + +bypassed + Amount of IO (both reads and writes) that has bypassed the cache + +cache_hits +cache_misses +cache_hit_ratio + Hits and misses are counted per individual IO as bcache sees them; a + partial hit is counted as a miss. + +cache_bypass_hits +cache_bypass_misses + Hits and misses for IO that is intended to skip the cache are still counted, + but broken out here. + +cache_miss_collisions + Counts instances where data was going to be inserted into the cache from a + cache miss, but raced with a write and data was already present (usually 0 + since the synchronization for cache misses was rewritten) + +cache_readaheads + Count of times readahead occured. + +SYSFS - CACHE SET: + +average_key_size + Average data per key in the btree. + +bdev<0..n> + Symlink to each of the attached backing devices. + +block_size + Block size of the cache devices. + +btree_cache_size + Amount of memory currently used by the btree cache + +bucket_size + Size of buckets + +cache<0..n> + Symlink to each of the cache devices comprising this cache set. + +cache_available_percent + Percentage of cache device free. + +clear_stats + Clears the statistics associated with this cache + +dirty_data + Amount of dirty data is in the cache (updated when garbage collection runs). + +flash_vol_create + Echoing a size to this file (in human readable units, k/M/G) creates a thinly + provisioned volume backed by the cache set. + +io_error_halflife +io_error_limit + These determines how many errors we accept before disabling the cache. + Each error is decayed by the half life (in # ios). If the decaying count + reaches io_error_limit dirty data is written out and the cache is disabled. + +journal_delay_ms + Journal writes will delay for up to this many milliseconds, unless a cache + flush happens sooner. Defaults to 100. + +root_usage_percent + Percentage of the root btree node in use. If this gets too high the node + will split, increasing the tree depth. + +stop + Write to this file to shut down the cache set - waits until all attached + backing devices have been shut down. + +tree_depth + Depth of the btree (A single node btree has depth 0). + +unregister + Detaches all backing devices and closes the cache devices; if dirty data is + present it will disable writeback caching and wait for it to be flushed. + +SYSFS - CACHE SET INTERNAL: + +This directory also exposes timings for a number of internal operations, with +separate files for average duration, average frequency, last occurence and max +duration: garbage collection, btree read, btree node sorts and btree splits. + +active_journal_entries + Number of journal entries that are newer than the index. + +btree_nodes + Total nodes in the btree. + +btree_used_percent + Average fraction of btree in use. + +bset_tree_stats + Statistics about the auxiliary search trees + +btree_cache_max_chain + Longest chain in the btree node cache's hash table + +cache_read_races + Counts instances where while data was being read from the cache, the bucket + was reused and invalidated - i.e. where the pointer was stale after the read + completed. When this occurs the data is reread from the backing device. + +trigger_gc + Writing to this file forces garbage collection to run. + +SYSFS - CACHE DEVICE: + +block_size + Minimum granularity of writes - should match hardware sector size. + +btree_written + Sum of all btree writes, in (kilo/mega/giga) bytes + +bucket_size + Size of buckets + +cache_replacement_policy + One of either lru, fifo or random. + +discard + Boolean; if on a discard/TRIM will be issued to each bucket before it is + reused. Defaults to off, since SATA TRIM is an unqueued command (and thus + slow). + +freelist_percent + Size of the freelist as a percentage of nbuckets. Can be written to to + increase the number of buckets kept on the freelist, which lets you + artificially reduce the size of the cache at runtime. Mostly for testing + purposes (i.e. testing how different size caches affect your hit rate), but + since buckets are discarded when they move on to the freelist will also make + the SSD's garbage collection easier by effectively giving it more reserved + space. + +io_errors + Number of errors that have occured, decayed by io_error_halflife. + +metadata_written + Sum of all non data writes (btree writes and all other metadata). + +nbuckets + Total buckets in this cache + +priority_stats + Statistics about how recently data in the cache has been accessed. This can + reveal your working set size. + +written + Sum of all data that has been written to the cache; comparison with + btree_written gives the amount of write inflation in bcache. diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt index a5eb7d19a65d..9887f0414c16 100644 --- a/Documentation/block/cfq-iosched.txt +++ b/Documentation/block/cfq-iosched.txt @@ -5,7 +5,7 @@ The main aim of CFQ scheduler is to provide a fair allocation of the disk I/O bandwidth for all the processes which requests an I/O operation. CFQ maintains the per process queue for the processes which request I/O -operation(syncronous requests). In case of asynchronous requests, all the +operation(synchronous requests). In case of asynchronous requests, all the requests from all the processes are batched together according to their process's I/O priority. @@ -66,6 +66,47 @@ This parameter is used to set the timeout of synchronous requests. Default value of this is 124ms. In case to favor synchronous requests over asynchronous one, this value should be decreased relative to fifo_expire_async. +group_idle +----------- +This parameter forces idling at the CFQ group level instead of CFQ +queue level. This was introduced after after a bottleneck was observed +in higher end storage due to idle on sequential queue and allow dispatch +from a single queue. The idea with this parameter is that it can be run with +slice_idle=0 and group_idle=8, so that idling does not happen on individual +queues in the group but happens overall on the group and thus still keeps the +IO controller working. +Not idling on individual queues in the group will dispatch requests from +multiple queues in the group at the same time and achieve higher throughput +on higher end storage. + +Default value for this parameter is 8ms. + +latency +------- +This parameter is used to enable/disable the latency mode of the CFQ +scheduler. If latency mode (called low_latency) is enabled, CFQ tries +to recompute the slice time for each process based on the target_latency set +for the system. This favors fairness over throughput. Disabling low +latency (setting it to 0) ignores target latency, allowing each process in the +system to get a full time slice. + +By default low latency mode is enabled. + +target_latency +-------------- +This parameter is used to calculate the time slice for a process if cfq's +latency mode is enabled. It will ensure that sync requests have an estimated +latency. But if sequential workload is higher(e.g. sequential read), +then to meet the latency constraints, throughput may decrease because of less +time for each process to issue I/O request before the cfq queue is switched. + +Though this can be overcome by disabling the latency_mode, it may increase +the read latency for some applications. This parameter allows for changing +target_latency through the sysfs interface which can provide the balanced +throughput and read latency. + +Default value for target_latency is 300ms. + slice_async ----------- This parameter is same as of slice_sync but for asynchronous queue. The @@ -98,8 +139,8 @@ in the device exceeds this parameter. This parameter is used for synchronous request. In case of storage with several disk, this setting can limit the parallel -processing of request. Therefore, increasing the value can imporve the -performace although this can cause the latency of some I/O to increase due +processing of request. Therefore, increasing the value can improve the +performance although this can cause the latency of some I/O to increase due to more number of requests. CFQ Group scheduling diff --git a/Documentation/devicetree/bindings/dma/atmel-dma.txt b/Documentation/devicetree/bindings/dma/atmel-dma.txt index 3c046ee6e8b5..c80e8a3402f0 100644 --- a/Documentation/devicetree/bindings/dma/atmel-dma.txt +++ b/Documentation/devicetree/bindings/dma/atmel-dma.txt @@ -1,14 +1,39 @@ * Atmel Direct Memory Access Controller (DMA) Required properties: -- compatible: Should be "atmel,-dma" -- reg: Should contain DMA registers location and length -- interrupts: Should contain DMA interrupt +- compatible: Should be "atmel,-dma". +- reg: Should contain DMA registers location and length. +- interrupts: Should contain DMA interrupt. +- #dma-cells: Must be <2>, used to represent the number of integer cells in +the dmas property of client devices. -Examples: +Example: -dma@ffffec00 { +dma0: dma@ffffec00 { compatible = "atmel,at91sam9g45-dma"; reg = <0xffffec00 0x200>; interrupts = <21>; + #dma-cells = <2>; +}; + +DMA clients connected to the Atmel DMA controller must use the format +described in the dma.txt file, using a three-cell specifier for each channel: +a phandle plus two interger cells. +The three cells in order are: + +1. A phandle pointing to the DMA controller. +2. The memory interface (16 most significant bits), the peripheral interface +(16 less significant bits). +3. The peripheral identifier for the hardware handshaking interface. The +identifier can be different for tx and rx. + +Example: + +i2c0@i2c@f8010000 { + compatible = "atmel,at91sam9x5-i2c"; + reg = <0xf8010000 0x100>; + interrupts = <9 4 6>; + dmas = <&dma0 1 7>, + <&dma0 1 8>; + dma-names = "tx", "rx"; }; diff --git a/Documentation/devicetree/bindings/mips/ralink.txt b/Documentation/devicetree/bindings/mips/ralink.txt new file mode 100644 index 000000000000..b35a8d04f8b6 --- /dev/null +++ b/Documentation/devicetree/bindings/mips/ralink.txt @@ -0,0 +1,17 @@ +Ralink MIPS SoC device tree bindings + +1. SoCs + +Each device tree must specify a compatible value for the Ralink SoC +it uses in the compatible property of the root node. The compatible +value must be one of the following values: + + ralink,rt2880-soc + ralink,rt3050-soc + ralink,rt3052-soc + ralink,rt3350-soc + ralink,rt3352-soc + ralink,rt3883-soc + ralink,rt5350-soc + ralink,mt7620a-soc + ralink,mt7620n-soc diff --git a/Documentation/devicetree/bindings/mtd/gpmc-nand.txt b/Documentation/devicetree/bindings/mtd/gpmc-nand.txt index e7f8d7ed47eb..6a983c1d87cd 100644 --- a/Documentation/devicetree/bindings/mtd/gpmc-nand.txt +++ b/Documentation/devicetree/bindings/mtd/gpmc-nand.txt @@ -56,20 +56,20 @@ Example for an AM33xx board: nand-bus-width = <16>; ti,nand-ecc-opt = "bch8"; - gpmc,sync-clk = <0>; - gpmc,cs-on = <0>; - gpmc,cs-rd-off = <44>; - gpmc,cs-wr-off = <44>; - gpmc,adv-on = <6>; - gpmc,adv-rd-off = <34>; - gpmc,adv-wr-off = <44>; - gpmc,we-off = <40>; - gpmc,oe-off = <54>; - gpmc,access = <64>; - gpmc,rd-cycle = <82>; - gpmc,wr-cycle = <82>; - gpmc,wr-access = <40>; - gpmc,wr-data-mux-bus = <0>; + gpmc,sync-clk-ps = <0>; + gpmc,cs-on-ns = <0>; + gpmc,cs-rd-off-ns = <44>; + gpmc,cs-wr-off-ns = <44>; + gpmc,adv-on-ns = <6>; + gpmc,adv-rd-off-ns = <34>; + gpmc,adv-wr-off-ns = <44>; + gpmc,we-off-ns = <40>; + gpmc,oe-off-ns = <54>; + gpmc,access-ns = <64>; + gpmc,rd-cycle-ns = <82>; + gpmc,wr-cycle-ns = <82>; + gpmc,wr-access-ns = <40>; + gpmc,wr-data-mux-bus-ns = <0>; #address-cells = <1>; #size-cells = <1>; diff --git a/Documentation/devicetree/bindings/mtd/partition.txt b/Documentation/devicetree/bindings/mtd/partition.txt index 6e1f61f1e789..9315ac96b49b 100644 --- a/Documentation/devicetree/bindings/mtd/partition.txt +++ b/Documentation/devicetree/bindings/mtd/partition.txt @@ -5,8 +5,12 @@ on platforms which have strong conventions about which portions of a flash are used for what purposes, but which don't use an on-flash partition table such as RedBoot. -#address-cells & #size-cells must both be present in the mtd device and be -equal to 1. +#address-cells & #size-cells must both be present in the mtd device. There are +two valid values for both: +<1>: for partitions that require a single 32-bit cell to represent their + size/address (aka the value is below 4 GiB) +<2>: for partitions that require two 32-bit cells to represent their + size/address (aka the value is 4 GiB or greater). Required properties: - reg : The partition's offset and size within the mtd bank. @@ -36,3 +40,31 @@ flash@0 { reg = <0x0100000 0x200000>; }; }; + +flash@1 { + #address-cells = <1>; + #size-cells = <2>; + + /* a 4 GiB partition */ + partition@0 { + label = "filesystem"; + reg = <0x00000000 0x1 0x00000000>; + }; +}; + +flash@2 { + #address-cells = <2>; + #size-cells = <2>; + + /* an 8 GiB partition */ + partition@0 { + label = "filesystem #1"; + reg = <0x0 0x00000000 0x2 0x00000000>; + }; + + /* a 4 GiB partition */ + partition@200000000 { + label = "filesystem #2"; + reg = <0x2 0x00000000 0x1 0x00000000>; + }; +}; diff --git a/Documentation/devicetree/bindings/net/gpmc-eth.txt b/Documentation/devicetree/bindings/net/gpmc-eth.txt index 24cb4e46f675..ace4a64b3695 100644 --- a/Documentation/devicetree/bindings/net/gpmc-eth.txt +++ b/Documentation/devicetree/bindings/net/gpmc-eth.txt @@ -26,16 +26,16 @@ Required properties: - bank-width: Address width of the device in bytes. GPMC supports 8-bit and 16-bit devices and so must be either 1 or 2 bytes. - compatible: Compatible string property for the ethernet child device. -- gpmc,cs-on: Chip-select assertion time -- gpmc,cs-rd-off: Chip-select de-assertion time for reads -- gpmc,cs-wr-off: Chip-select de-assertion time for writes -- gpmc,oe-on: Output-enable assertion time -- gpmc,oe-off Output-enable de-assertion time -- gpmc,we-on: Write-enable assertion time -- gpmc,we-off: Write-enable de-assertion time -- gpmc,access: Start cycle to first data capture (read access) -- gpmc,rd-cycle: Total read cycle time -- gpmc,wr-cycle: Total write cycle time +- gpmc,cs-on-ns: Chip-select assertion time +- gpmc,cs-rd-off-ns: Chip-select de-assertion time for reads +- gpmc,cs-wr-off-ns: Chip-select de-assertion time for writes +- gpmc,oe-on-ns: Output-enable assertion time +- gpmc,oe-off-ns: Output-enable de-assertion time +- gpmc,we-on-ns: Write-enable assertion time +- gpmc,we-off-ns: Write-enable de-assertion time +- gpmc,access-ns: Start cycle to first data capture (read access) +- gpmc,rd-cycle-ns: Total read cycle time +- gpmc,wr-cycle-ns: Total write cycle time - reg: Chip-select, base address (relative to chip-select) and size of the memory mapped for the device. Note that base address will be typically 0 as this @@ -65,24 +65,24 @@ gpmc: gpmc@6e000000 { bank-width = <2>; gpmc,mux-add-data; - gpmc,cs-on = <0>; - gpmc,cs-rd-off = <186>; - gpmc,cs-wr-off = <186>; - gpmc,adv-on = <12>; - gpmc,adv-rd-off = <48>; - gpmc,adv-wr-off = <48>; - gpmc,oe-on = <54>; - gpmc,oe-off = <168>; - gpmc,we-on = <54>; - gpmc,we-off = <168>; - gpmc,rd-cycle = <186>; - gpmc,wr-cycle = <186>; - gpmc,access = <114>; - gpmc,page-burst-access = <6>; - gpmc,bus-turnaround = <12>; - gpmc,cycle2cycle-delay = <18>; - gpmc,wr-data-mux-bus = <90>; - gpmc,wr-access = <186>; + gpmc,cs-on-ns = <0>; + gpmc,cs-rd-off-ns = <186>; + gpmc,cs-wr-off-ns = <186>; + gpmc,adv-on-ns = <12>; + gpmc,adv-rd-off-ns = <48>; + gpmc,adv-wr-off-ns = <48>; + gpmc,oe-on-ns = <54>; + gpmc,oe-off-ns = <168>; + gpmc,we-on-ns = <54>; + gpmc,we-off-ns = <168>; + gpmc,rd-cycle-ns = <186>; + gpmc,wr-cycle-ns = <186>; + gpmc,access-ns = <114>; + gpmc,page-burst-access-ns = <6>; + gpmc,bus-turnaround-ns = <12>; + gpmc,cycle2cycle-delay-ns = <18>; + gpmc,wr-data-mux-bus-ns = <90>; + gpmc,wr-access-ns = <186>; gpmc,cycle2cycle-samecsen; gpmc,cycle2cycle-diffcsen; diff --git a/Documentation/devicetree/bindings/thermal/armada-thermal.txt b/Documentation/devicetree/bindings/thermal/armada-thermal.txt new file mode 100644 index 000000000000..fff93d5f92de --- /dev/null +++ b/Documentation/devicetree/bindings/thermal/armada-thermal.txt @@ -0,0 +1,22 @@ +* Marvell Armada 370/XP thermal management + +Required properties: + +- compatible: Should be set to one of the following: + marvell,armada370-thermal + marvell,armadaxp-thermal + +- reg: Device's register space. + Two entries are expected, see the examples below. + The first one is required for the sensor register; + the second one is required for the control register + to be used for sensor initialization (a.k.a. calibration). + +Example: + + thermal@d0018300 { + compatible = "marvell,armada370-thermal"; + reg = <0xd0018300 0x4 + 0xd0018304 0x4>; + status = "okay"; + }; diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 4d1919bf2332..6931c4348d24 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -42,6 +42,7 @@ onnn ON Semiconductor Corp. picochip Picochip Ltd powervr PowerVR (deprecated, use img) qcom Qualcomm, Inc. +ralink Mediatek/Ralink Technology Corp. ramtron Ramtron International realtek Realtek Semiconductor Corp. renesas Renesas Electronics Corporation diff --git a/Documentation/dmatest.txt b/Documentation/dmatest.txt new file mode 100644 index 000000000000..279ac0a8c5b1 --- /dev/null +++ b/Documentation/dmatest.txt @@ -0,0 +1,81 @@ + DMA Test Guide + ============== + + Andy Shevchenko + +This small document introduces how to test DMA drivers using dmatest module. + + Part 1 - How to build the test module + +The menuconfig contains an option that could be found by following path: + Device Drivers -> DMA Engine support -> DMA Test client + +In the configuration file the option called CONFIG_DMATEST. The dmatest could +be built as module or inside kernel. Let's consider those cases. + + Part 2 - When dmatest is built as a module... + +After mounting debugfs and loading the module, the /sys/kernel/debug/dmatest +folder with nodes will be created. They are the same as module parameters with +addition of the 'run' node that controls run and stop phases of the test. + +Note that in this case test will not run on load automatically. + +Example of usage: + % echo dma0chan0 > /sys/kernel/debug/dmatest/channel + % echo 2000 > /sys/kernel/debug/dmatest/timeout + % echo 1 > /sys/kernel/debug/dmatest/iterations + % echo 1 > /sys/kernel/debug/dmatest/run + +Hint: available channel list could be extracted by running the following +command: + % ls -1 /sys/class/dma/ + +After a while you will start to get messages about current status or error like +in the original code. + +Note that running a new test will stop any in progress test. + +The following command should return actual state of the test. + % cat /sys/kernel/debug/dmatest/run + +To wait for test done the user may perform a busy loop that checks the state. + + % while [ $(cat /sys/kernel/debug/dmatest/run) = "Y" ] + > do + > echo -n "." + > sleep 1 + > done + > echo + + Part 3 - When built-in in the kernel... + +The module parameters that is supplied to the kernel command line will be used +for the first performed test. After user gets a control, the test could be +interrupted or re-run with same or different parameters. For the details see +the above section "Part 2 - When dmatest is built as a module..." + +In both cases the module parameters are used as initial values for the test case. +You always could check them at run-time by running + % grep -H . /sys/module/dmatest/parameters/* + + Part 4 - Gathering the test results + +The module provides a storage for the test results in the memory. The gathered +data could be used after test is done. + +The special file 'results' in the debugfs represents gathered data of the in +progress test. The messages collected are printed to the kernel log as well. + +Example of output: + % cat /sys/kernel/debug/dmatest/results + dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0) + +The message format is unified across the different types of errors. A number in +the parens represents additional information, e.g. error code, error counter, +or status. + +Comparison between buffers is stored to the dedicated structure. + +Note that the verify result is now accessible only via file 'results' in the +debugfs. diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt index 7671352216f1..b349d57b76ea 100644 --- a/Documentation/filesystems/btrfs.txt +++ b/Documentation/filesystems/btrfs.txt @@ -1,8 +1,8 @@ - BTRFS - ===== +BTRFS +===== -Btrfs is a new copy on write filesystem for Linux aimed at +Btrfs is a copy on write filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Initially developed by Oracle, Btrfs is licensed under the GPL and open for contribution from anyone. @@ -34,9 +34,175 @@ The main Btrfs features include: * Online filesystem defragmentation +Mount Options +============= - MAILING LIST - ============ +When mounting a btrfs filesystem, the following option are accepted. +Unless otherwise specified, all options default to off. + + alloc_start= + Debugging option to force all block allocations above a certain + byte threshold on each block device. The value is specified in + bytes, optionally with a K, M, or G suffix, case insensitive. + Default is 1MB. + + autodefrag + Detect small random writes into files and queue them up for the + defrag process. Works best for small files; Not well suited for + large database workloads. + + check_int + check_int_data + check_int_print_mask= + These debugging options control the behavior of the integrity checking + module (the BTRFS_FS_CHECK_INTEGRITY config option required). + + check_int enables the integrity checker module, which examines all + block write requests to ensure on-disk consistency, at a large + memory and CPU cost. + + check_int_data includes extent data in the integrity checks, and + implies the check_int option. + + check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values + as defined in fs/btrfs/check-integrity.c, to control the integrity + checker module behavior. + + See comments at the top of fs/btrfs/check-integrity.c for more info. + + compress + compress= + compress-force + compress-force= + Control BTRFS file data compression. Type may be specified as "zlib" + "lzo" or "no" (for no compression, used for remounting). If no type + is specified, zlib is used. If compress-force is specified, + all files will be compressed, whether or not they compress well. + If compression is enabled, nodatacow and nodatasum are disabled. + + degraded + Allow mounts to continue with missing devices. A read-write mount may + fail with too many devices missing, for example if a stripe member + is completely missing. + + device= + Specify a device during mount so that ioctls on the control device + can be avoided. Especialy useful when trying to mount a multi-device + setup as root. May be specified multiple times for multiple devices. + + discard + Issue frequent commands to let the block device reclaim space freed by + the filesystem. This is useful for SSD devices, thinly provisioned + LUNs and virtual machine images, but may have a significant + performance impact. (The fstrim command is also available to + initiate batch trims from userspace). + + enospc_debug + Debugging option to be more verbose in some ENOSPC conditions. + + fatal_errors= + Action to take when encountering a fatal error: + "bug" - BUG() on a fatal error. This is the default. + "panic" - panic() on a fatal error. + + flushoncommit + The 'flushoncommit' mount option forces any data dirtied by a write in a + prior transaction to commit as part of the current commit. This makes + the committed state a fully consistent view of the file system from the + application's perspective (i.e., it includes all completed file system + operations). This was previously the behavior only when a snapshot is + created. + + inode_cache + Enable free inode number caching. Defaults to off due to an overflow + problem when the free space crcs don't fit inside a single page. + + max_inline= + Specify the maximum amount of space, in bytes, that can be inlined in + a metadata B-tree leaf. The value is specified in bytes, optionally + with a K, M, or G suffix, case insensitive. In practice, this value + is limited by the root sector size, with some space unavailable due + to leaf headers. For a 4k sectorsize, max inline data is ~3900 bytes. + + metadata_ratio= + Specify that 1 metadata chunk should be allocated after every + data chunks. Off by default. + + noacl + Disable support for Posix Access Control Lists (ACLs). See the + acl(5) manual page for more information about ACLs. + + nobarrier + Disables the use of block layer write barriers. Write barriers ensure + that certain IOs make it through the device cache and are on persistent + storage. If used on a device with a volatile (non-battery-backed) + write-back cache, this option will lead to filesystem corruption on a + system crash or power loss. + + nodatacow + Disable data copy-on-write for newly created files. Implies nodatasum, + and disables all compression. + + nodatasum + Disable data checksumming for newly created files. + + notreelog + Disable the tree logging used for fsync and O_SYNC writes. + + recovery + Enable autorecovery attempts if a bad tree root is found at mount time. + Currently this scans a list of several previous tree roots and tries to + use the first readable. + + skip_balance + Skip automatic resume of interrupted balance operation after mount. + May be resumed with "btrfs balance resume." + + space_cache (*) + Enable the on-disk freespace cache. + nospace_cache + Disable freespace cache loading without clearing the cache. + clear_cache + Force clearing and rebuilding of the disk space cache if something + has gone wrong. + + ssd + nossd + ssd_spread + Options to control ssd allocation schemes. By default, BTRFS will + enable or disable ssd allocation heuristics depending on whether a + rotational or nonrotational disk is in use. The ssd and nossd options + can override this autodetection. + + The ssd_spread mount option attempts to allocate into big chunks + of unused space, and may perform better on low-end ssds. ssd_spread + implies ssd, enabling all other ssd heuristics as well. + + subvol= + Mount subvolume at rather than the root subvolume. is + relative to the top level subvolume. + + subvolid= + Mount subvolume specified by an ID number rather than the root subvolume. + This allows mounting of subvolumes which are not in the root of the mounted + filesystem. + You can use "btrfs subvolume list" to see subvolume ID numbers. + + subvolrootid= (deprecated) + Mount subvolume specified by rather than the root subvolume. + This allows mounting of subvolumes which are not in the root of the mounted + filesystem. + You can use "btrfs subvolume show " to see the object ID for a subvolume. + + thread_pool= + The number of worker threads to allocate. The default number is equal + to the number of CPUs + 2, or 8, whichever is smaller. + + user_subvol_rm_allowed + Allow subvolumes to be deleted by a non-root user. Use with caution. + +MAILING LIST +============ There is a Btrfs mailing list hosted on vger.kernel.org. You can find details on how to subscribe here: @@ -49,8 +215,8 @@ http://dir.gmane.org/gmane.comp.file-systems.btrfs - IRC - === +IRC +=== Discussion of Btrfs also occurs on the #btrfs channel of the Freenode IRC network. diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt index dcf338e62b71..bd3c56c67380 100644 --- a/Documentation/filesystems/f2fs.txt +++ b/Documentation/filesystems/f2fs.txt @@ -146,7 +146,7 @@ USAGE Format options -------------- --l [label] : Give a volume label, up to 256 unicode name. +-l [label] : Give a volume label, up to 512 unicode name. -a [0 or 1] : Split start location of each area for heap-based allocation. 1 is set by default, which performs this. -o [int] : Set overprovision ratio in percent over volume size. @@ -156,6 +156,8 @@ Format options -z [int] : Set the number of sections per zone. 1 is set by default. -e [str] : Set basic extension list. e.g. "mp3,gif,mov" +-t [0 or 1] : Disable discard command or not. + 1 is set by default, which conducts discard. ================================================================================ DESIGN diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 77a1d11af723..6f83fa965b4b 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -72,11 +72,11 @@ in this document, but drivers acting as clients to the GPIO interface must not care how it's implemented.) That said, if the convention is supported on their platform, drivers should -use it when possible. Platforms must declare GENERIC_GPIO support in their -Kconfig (boolean true), and provide an file. Drivers that can't -work without standard GPIO calls should have Kconfig entries which depend -on GENERIC_GPIO. The GPIO calls are available, either as "real code" or as -optimized-away stubs, when drivers use the include file: +use it when possible. Platforms must select ARCH_REQUIRE_GPIOLIB or +ARCH_WANT_OPTIONAL_GPIOLIB in their Kconfig. Drivers that can't work without +standard GPIO calls should have Kconfig entries which depend on GPIOLIB. The +GPIO calls are available, either as "real code" or as optimized-away stubs, +when drivers use the include file: #include diff --git a/Documentation/thermal/exynos_thermal_emulation b/Documentation/thermal/exynos_thermal_emulation index b73bbfb697bb..36a3e79c1203 100644 --- a/Documentation/thermal/exynos_thermal_emulation +++ b/Documentation/thermal/exynos_thermal_emulation @@ -13,11 +13,11 @@ Thermal emulation mode supports software debug for TMU's operation. User can set manually with software code and TMU will read current temperature from user value not from sensor's value. -Enabling CONFIG_EXYNOS_THERMAL_EMUL option will make this support in available. -When it's enabled, sysfs node will be created under -/sys/bus/platform/devices/'exynos device name'/ with name of 'emulation'. +Enabling CONFIG_THERMAL_EMULATION option will make this support available. +When it's enabled, sysfs node will be created as +/sys/devices/virtual/thermal/thermal_zone'zone id'/emul_temp. -The sysfs node, 'emulation', will contain value 0 for the initial state. When you input any +The sysfs node, 'emul_node', will contain value 0 for the initial state. When you input any temperature you want to update to sysfs node, it automatically enable emulation mode and current temperature will be changed into it. (Exynos also supports user changable delay time which would be used to delay of diff --git a/Documentation/thermal/sysfs-api.txt b/Documentation/thermal/sysfs-api.txt index 6859661c9d31..a71bd5b90fe8 100644 --- a/Documentation/thermal/sysfs-api.txt +++ b/Documentation/thermal/sysfs-api.txt @@ -31,15 +31,17 @@ temperature) and throttle appropriate devices. 1. thermal sysfs driver interface functions 1.1 thermal zone device interface -1.1.1 struct thermal_zone_device *thermal_zone_device_register(char *name, +1.1.1 struct thermal_zone_device *thermal_zone_device_register(char *type, int trips, int mask, void *devdata, - struct thermal_zone_device_ops *ops) + struct thermal_zone_device_ops *ops, + const struct thermal_zone_params *tzp, + int passive_delay, int polling_delay)) This interface function adds a new thermal zone device (sensor) to /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the thermal cooling devices registered at the same time. - name: the thermal zone name. + type: the thermal zone type. trips: the total number of trip points this thermal zone supports. mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. devdata: device private data @@ -57,6 +59,12 @@ temperature) and throttle appropriate devices. will be fired. .set_emul_temp: set the emulation temperature which helps in debugging different threshold temperature points. + tzp: thermal zone platform parameters. + passive_delay: number of milliseconds to wait between polls when + performing passive cooling. + polling_delay: number of milliseconds to wait between polls when checking + whether trip points have been crossed (0 for interrupt driven systems). + 1.1.2 void thermal_zone_device_unregister(struct thermal_zone_device *tz) @@ -265,6 +273,10 @@ emul_temp Unit: millidegree Celsius WO, Optional + WARNING: Be careful while enabling this option on production systems, + because userland can easily disable the thermal policy by simply + flooding this sysfs node with low temperature values. + ***************************** * Cooling device attributes * ***************************** @@ -363,7 +375,7 @@ This function returns the thermal_instance corresponding to a given {thermal_zone, cooling_device, trip_point} combination. Returns NULL if such an instance does not exist. -5.3:notify_thermal_framework: +5.3:thermal_notify_framework: This function handles the trip events from sensor drivers. It starts throttling the cooling devices according to the policy configured. For CRITICAL and HOT trip points, this notifies the respective drivers, @@ -375,11 +387,3 @@ platform data is provided, this uses the step_wise throttling policy. This function serves as an arbitrator to set the state of a cooling device. It sets the cooling device to the deepest cooling state if possible. - -5.5:thermal_register_governor: -This function lets the various thermal governors to register themselves -with the Thermal framework. At run time, depending on a zone's platform -data, a particular governor is used for throttling. - -5.6:thermal_unregister_governor: -This function unregisters a governor from the thermal framework. diff --git a/Documentation/xtensa/mmu.txt b/Documentation/xtensa/mmu.txt new file mode 100644 index 000000000000..2b1af7606d57 --- /dev/null +++ b/Documentation/xtensa/mmu.txt @@ -0,0 +1,46 @@ +MMUv3 initialization sequence. + +The code in the initialize_mmu macro sets up MMUv3 memory mapping +identically to MMUv2 fixed memory mapping. Depending on +CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX symbol this code is +located in one of the following address ranges: + + 0xF0000000..0xFFFFFFFF (will keep same address in MMU v2 layout; + typically ROM) + 0x00000000..0x07FFFFFF (system RAM; this code is actually linked + at 0xD0000000..0xD7FFFFFF [cached] + or 0xD8000000..0xDFFFFFFF [uncached]; + in any case, initially runs elsewhere + than linked, so have to be careful) + +The code has the following assumptions: + This code fragment is run only on an MMU v3. + TLBs are in their reset state. + ITLBCFG and DTLBCFG are zero (reset state). + RASID is 0x04030201 (reset state). + PS.RING is zero (reset state). + LITBASE is zero (reset state, PC-relative literals); required to be PIC. + +TLB setup proceeds along the following steps. + + Legend: + VA = virtual address (two upper nibbles of it); + PA = physical address (two upper nibbles of it); + pc = physical range that contains this code; + +After step 2, we jump to virtual address in 0x40000000..0x5fffffff +that corresponds to next instruction to execute in this code. +After step 4, we jump to intended (linked) address of this code. + + Step 0 Step1 Step 2 Step3 Step 4 Step5 + ============ ===== ============ ===== ============ ===== + VA PA PA VA PA PA VA PA PA + ------ -- -- ------ -- -- ------ -- -- + E0..FF -> E0 -> E0 E0..FF -> E0 F0..FF -> F0 -> F0 + C0..DF -> C0 -> C0 C0..DF -> C0 E0..EF -> F0 -> F0 + A0..BF -> A0 -> A0 A0..BF -> A0 D8..DF -> 00 -> 00 + 80..9F -> 80 -> 80 80..9F -> 80 D0..D7 -> 00 -> 00 + 60..7F -> 60 -> 60 60..7F -> 60 + 40..5F -> 40 40..5F -> pc -> pc 40..5F -> pc + 20..3F -> 20 -> 20 20..3F -> 20 + 00..1F -> 00 -> 00 00..1F -> 00 diff --git a/Documentation/zh_CN/gpio.txt b/Documentation/zh_CN/gpio.txt index 4fa7b4e6f856..d5b8f01833f4 100644 --- a/Documentation/zh_CN/gpio.txt +++ b/Documentation/zh_CN/gpio.txt @@ -84,10 +84,10 @@ GPIO 公约 控制器的抽象函数来实现它。(有一些可选的代码能支持这种策略的实现,本文档 后面会介绍,但作为 GPIO 接口的客户端驱动程序必须与它的实现无关。) -也就是说,如果在他们的平台上支持这个公约,驱动应尽可能的使用它。平台 -必须在 Kconfig 中声明对 GENERIC_GPIO的支持 (布尔型 true),并提供 -一个 文件。那些调用标准 GPIO 函数的驱动应该在 Kconfig -入口中声明依赖GENERIC_GPIO。当驱动包含文件: +也就是说,如果在他们的平台上支持这个公约,驱动应尽可能的使用它。同时,平台 +必须在 Kconfig 中选择 ARCH_REQUIRE_GPIOLIB 或者 ARCH_WANT_OPTIONAL_GPIOLIB +选项。那些调用标准 GPIO 函数的驱动应该在 Kconfig 入口中声明依赖GENERIC_GPIO。 +当驱动包含文件: #include diff --git a/MAINTAINERS b/MAINTAINERS index e73c374483cb..3d7782b9f90d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1620,6 +1620,13 @@ W: http://www.baycom.org/~tom/ham/ham.html S: Maintained F: drivers/net/hamradio/baycom* +BCACHE (BLOCK LAYER CACHE) +M: Kent Overstreet +L: linux-bcache@vger.kernel.org +W: http://bcache.evilpiepirate.org +S: Maintained: +F: drivers/md/bcache/ + BEFS FILE SYSTEM S: Orphan F: Documentation/filesystems/befs.txt @@ -8022,11 +8029,14 @@ F: arch/xtensa/ THERMAL M: Zhang Rui +M: Eduardo Valentin L: linux-pm@vger.kernel.org T: git git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git +Q: https://patchwork.kernel.org/project/linux-pm/list/ S: Supported F: drivers/thermal/ F: include/linux/thermal.h +F: include/linux/cpu_cooling.h THINGM BLINK(1) USB RGB LED DRIVER M: Vivien Didelot diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig index 8629127640cf..837a1f2d8b96 100644 --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -55,9 +55,6 @@ config GENERIC_CALIBRATE_DELAY bool default y -config GENERIC_GPIO - bool - config ZONE_DMA bool default y diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index e6f4eca09ee3..5917099470ea 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -16,8 +16,6 @@ config ARC select GENERIC_FIND_FIRST_BIT # for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP select GENERIC_IRQ_SHOW - select GENERIC_KERNEL_EXECVE - select GENERIC_KERNEL_THREAD select GENERIC_PENDING_IRQ if SMP select GENERIC_SMP_IDLE_THREAD select HAVE_ARCH_KGDB @@ -61,9 +59,6 @@ config GENERIC_CALIBRATE_DELAY config GENERIC_HWEIGHT def_bool y -config BINFMT_ELF - def_bool y - config STACKTRACE_SUPPORT def_bool y select STACKTRACE @@ -82,6 +77,7 @@ menu "ARC Architecture Configuration" menu "ARC Platform/SoC/Board" source "arch/arc/plat-arcfpga/Kconfig" +source "arch/arc/plat-tb10x/Kconfig" #New platform adds here endmenu @@ -134,9 +130,6 @@ if SMP config ARC_HAS_COH_CACHES def_bool n -config ARC_HAS_COH_LLSC - def_bool n - config ARC_HAS_COH_RTSC def_bool n @@ -189,6 +182,10 @@ config ARC_CACHE_PAGES Note that Global I/D ENABLE + Per Page DISABLE works but corollary Global DISABLE + Per Page ENABLE won't work +config ARC_CACHE_VIPT_ALIASING + bool "Support VIPT Aliasing D$" + default n + endif #ARC_CACHE config ARC_HAS_ICCM @@ -304,6 +301,9 @@ config ARC_FPU_SAVE_RESTORE based on actual usage of FPU by a task. Thus our implemn does this for all tasks in system. +config ARC_CANT_LLSC + def_bool n + menuconfig ARC_CPU_REL_4_10 bool "Enable support for Rel 4.10 features" default n @@ -314,9 +314,7 @@ menuconfig ARC_CPU_REL_4_10 config ARC_HAS_LLSC bool "Insn: LLOCK/SCOND (efficient atomic ops)" default y - depends on ARC_CPU_770 - # if SMP, enable LLSC ONLY if ARC implementation has coherent atomics - depends on !SMP || ARC_HAS_COH_LLSC + depends on ARC_CPU_770 && !ARC_CANT_LLSC config ARC_HAS_SWAPE bool "Insn: SWAPE (endian-swap)" @@ -415,13 +413,6 @@ config ARC_DBG_TLB_MISS_COUNT Counts number of I and D TLB Misses and exports them via Debugfs The counters can be cleared via Debugfs as well -config CMDLINE - string "Kernel command line to built-in" - default "print-fatal-signals=1" - help - The default command line which will be appended to the optional - u-boot provided command line (see below) - config CMDLINE_UBOOT bool "Support U-boot kernel command line passing" default n @@ -430,8 +421,8 @@ config CMDLINE_UBOOT command line from the U-boot environment to the Linux kernel then switch this option on. ARC U-boot will setup the cmdline in RAM/flash and set r2 to point - to it. kernel startup code will copy the string into cmdline buffer - and also append CONFIG_CMDLINE. + to it. kernel startup code will append this to DeviceTree + /bootargs provided cmdline args. config ARC_BUILTIN_DTB_NAME string "Built in DTB" @@ -441,6 +432,10 @@ config ARC_BUILTIN_DTB_NAME source "kernel/Kconfig.preempt" +menu "Executable file formats" +source "fs/Kconfig.binfmt" +endmenu + endmenu # "ARC Architecture Configuration" source "mm/Kconfig" diff --git a/arch/arc/Makefile b/arch/arc/Makefile index 92379c7cbc1a..183397fd289e 100644 --- a/arch/arc/Makefile +++ b/arch/arc/Makefile @@ -8,6 +8,10 @@ UTS_MACHINE := arc +ifeq ($(CROSS_COMPILE),) +CROSS_COMPILE := arc-elf32- +endif + KBUILD_DEFCONFIG := fpga_defconfig cflags-y += -mA7 -fno-common -pipe -fno-builtin -D__linux__ @@ -87,20 +91,23 @@ core-y += arch/arc/ core-y += arch/arc/boot/dts/ core-$(CONFIG_ARC_PLAT_FPGA_LEGACY) += arch/arc/plat-arcfpga/ +core-$(CONFIG_ARC_PLAT_TB10X) += arch/arc/plat-tb10x/ drivers-$(CONFIG_OPROFILE) += arch/arc/oprofile/ libs-y += arch/arc/lib/ $(LIBGCC) +boot := arch/arc/boot + #default target for make without any arguements. -KBUILD_IMAGE := bootpImage +KBUILD_IMAGE := bootpImage all: $(KBUILD_IMAGE) -boot := arch/arc/boot - bootpImage: vmlinux -uImage: vmlinux +boot_targets += uImage uImage.bin uImage.gz + +$(boot_targets): vmlinux $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@ %.dtb %.dtb.S %.dtb.o: scripts diff --git a/arch/arc/boot/Makefile b/arch/arc/boot/Makefile index 7d514c24e095..e597cb34c16a 100644 --- a/arch/arc/boot/Makefile +++ b/arch/arc/boot/Makefile @@ -3,7 +3,6 @@ targets := vmlinux.bin vmlinux.bin.gz uImage # uImage build relies on mkimage being availble on your host for ARC target # You will need to build u-boot for ARC, rename mkimage to arc-elf32-mkimage # and make sure it's reacable from your PATH -MKIMAGE := $(srctree)/scripts/mkuboot.sh OBJCOPYFLAGS= -O binary -R .note -R .note.gnu.build-id -R .comment -S @@ -12,7 +11,12 @@ LINUX_START_TEXT = $$(readelf -h vmlinux | \ UIMAGE_LOADADDR = $(CONFIG_LINUX_LINK_BASE) UIMAGE_ENTRYADDR = $(LINUX_START_TEXT) -UIMAGE_COMPRESSION = gzip + +suffix-y := bin +suffix-$(CONFIG_KERNEL_GZIP) := gz + +targets += uImage uImage.bin uImage.gz +extra-y += vmlinux.bin vmlinux.bin.gz $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) @@ -20,7 +24,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE $(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE $(call if_changed,gzip) -$(obj)/uImage: $(obj)/vmlinux.bin.gz FORCE - $(call if_changed,uimage) +$(obj)/uImage.bin: $(obj)/vmlinux.bin FORCE + $(call if_changed,uimage,none) + +$(obj)/uImage.gz: $(obj)/vmlinux.bin.gz FORCE + $(call if_changed,uimage,gzip) -PHONY += FORCE +$(obj)/uImage: $(obj)/uImage.$(suffix-y) + @ln -sf $(notdir $<) $@ + @echo ' Image $@ is ready' diff --git a/arch/arc/boot/dts/Makefile b/arch/arc/boot/dts/Makefile index 5776835d583f..faf240e29ec2 100644 --- a/arch/arc/boot/dts/Makefile +++ b/arch/arc/boot/dts/Makefile @@ -8,6 +8,8 @@ endif obj-y += $(builtindtb-y).dtb.o targets += $(builtindtb-y).dtb +.SECONDARY: $(obj)/$(builtindtb-y).dtb.S + dtbs: $(addprefix $(obj)/, $(builtindtb-y).dtb) -clean-files := *.dtb +clean-files := *.dtb *.dtb.S diff --git a/arch/arc/boot/dts/abilis_tb100.dtsi b/arch/arc/boot/dts/abilis_tb100.dtsi new file mode 100644 index 000000000000..941ad118a7e7 --- /dev/null +++ b/arch/arc/boot/dts/abilis_tb100.dtsi @@ -0,0 +1,340 @@ +/* + * Abilis Systems TB100 SOC device tree + * + * Copyright (C) Abilis Systems 2013 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/include/ "abilis_tb10x.dtsi" + +/* interrupt specifiers + * -------------------- + * 0: rising, 1: low, 2: high, 3: falling, + */ + +/ { + clock-frequency = <500000000>; /* 500 MHZ */ + + soc100 { + bus-frequency = <166666666>; + + pll0: oscillator { + clock-frequency = <1000000000>; + }; + cpu_clk: clkdiv_cpu { + clock-mult = <1>; + clock-div = <2>; + }; + ahb_clk: clkdiv_ahb { + clock-mult = <1>; + clock-div = <6>; + }; + + iomux: iomux@FF10601c { + /* Port 1 */ + pctl_tsin_s0: pctl-tsin-s0 { /* Serial TS-in 0 */ + pingrp = "mis0_pins"; + }; + pctl_tsin_s1: pctl-tsin-s1 { /* Serial TS-in 1 */ + pingrp = "mis1_pins"; + }; + pctl_gpio_a: pctl-gpio-a { /* GPIO bank A */ + pingrp = "gpioa_pins"; + }; + pctl_tsin_p1: pctl-tsin-p1 { /* Parallel TS-in 1 */ + pingrp = "mip1_pins"; + }; + /* Port 2 */ + pctl_tsin_s2: pctl-tsin-s2 { /* Serial TS-in 2 */ + pingrp = "mis2_pins"; + }; + pctl_tsin_s3: pctl-tsin-s3 { /* Serial TS-in 3 */ + pingrp = "mis3_pins"; + }; + pctl_gpio_c: pctl-gpio-c { /* GPIO bank C */ + pingrp = "gpioc_pins"; + }; + pctl_tsin_p3: pctl-tsin-p3 { /* Parallel TS-in 3 */ + pingrp = "mip3_pins"; + }; + /* Port 3 */ + pctl_tsin_s4: pctl-tsin-s4 { /* Serial TS-in 4 */ + pingrp = "mis4_pins"; + }; + pctl_tsin_s5: pctl-tsin-s5 { /* Serial TS-in 5 */ + pingrp = "mis5_pins"; + }; + pctl_gpio_e: pctl-gpio-e { /* GPIO bank E */ + pingrp = "gpioe_pins"; + }; + pctl_tsin_p5: pctl-tsin-p5 { /* Parallel TS-in 5 */ + pingrp = "mip5_pins"; + }; + /* Port 4 */ + pctl_tsin_s6: pctl-tsin-s6 { /* Serial TS-in 6 */ + pingrp = "mis6_pins"; + }; + pctl_tsin_s7: pctl-tsin-s7 { /* Serial TS-in 7 */ + pingrp = "mis7_pins"; + }; + pctl_gpio_g: pctl-gpio-g { /* GPIO bank G */ + pingrp = "gpiog_pins"; + }; + pctl_tsin_p7: pctl-tsin-p7 { /* Parallel TS-in 7 */ + pingrp = "mip7_pins"; + }; + /* Port 5 */ + pctl_gpio_j: pctl-gpio-j { /* GPIO bank J */ + pingrp = "gpioj_pins"; + }; + pctl_gpio_k: pctl-gpio-k { /* GPIO bank K */ + pingrp = "gpiok_pins"; + }; + pctl_ciplus: pctl-ciplus { /* CI+ interface */ + pingrp = "ciplus_pins"; + }; + pctl_mcard: pctl-mcard { /* M-Card interface */ + pingrp = "mcard_pins"; + }; + /* Port 6 */ + pctl_tsout_p: pctl-tsout-p { /* Parallel TS-out */ + pingrp = "mop_pins"; + }; + pctl_tsout_s0: pctl-tsout-s0 { /* Serial TS-out 0 */ + pingrp = "mos0_pins"; + }; + pctl_tsout_s1: pctl-tsout-s1 { /* Serial TS-out 1 */ + pingrp = "mos1_pins"; + }; + pctl_tsout_s2: pctl-tsout-s2 { /* Serial TS-out 2 */ + pingrp = "mos2_pins"; + }; + pctl_tsout_s3: pctl-tsout-s3 { /* Serial TS-out 3 */ + pingrp = "mos3_pins"; + }; + /* Port 7 */ + pctl_uart0: pctl-uart0 { /* UART 0 */ + pingrp = "uart0_pins"; + }; + pctl_uart1: pctl-uart1 { /* UART 1 */ + pingrp = "uart1_pins"; + }; + pctl_gpio_l: pctl-gpio-l { /* GPIO bank L */ + pingrp = "gpiol_pins"; + }; + pctl_gpio_m: pctl-gpio-m { /* GPIO bank M */ + pingrp = "gpiom_pins"; + }; + /* Port 8 */ + pctl_spi3: pctl-spi3 { + pingrp = "spi3_pins"; + }; + /* Port 9 */ + pctl_spi1: pctl-spi1 { + pingrp = "spi1_pins"; + }; + pctl_gpio_n: pctl-gpio-n { + pingrp = "gpion_pins"; + }; + /* Unmuxed GPIOs */ + pctl_gpio_b: pctl-gpio-b { + pingrp = "gpiob_pins"; + }; + pctl_gpio_d: pctl-gpio-d { + pingrp = "gpiod_pins"; + }; + pctl_gpio_f: pctl-gpio-f { + pingrp = "gpiof_pins"; + }; + pctl_gpio_h: pctl-gpio-h { + pingrp = "gpioh_pins"; + }; + pctl_gpio_i: pctl-gpio-i { + pingrp = "gpioi_pins"; + }; + }; + + gpioa: gpio@FF140000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF140000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <0>; + gpio-pins = <&pctl_gpio_a>; + }; + gpiob: gpio@FF141000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF141000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <3>; + gpio-pins = <&pctl_gpio_b>; + }; + gpioc: gpio@FF142000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF142000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <5>; + gpio-pins = <&pctl_gpio_c>; + }; + gpiod: gpio@FF143000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF143000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <8>; + gpio-pins = <&pctl_gpio_d>; + }; + gpioe: gpio@FF144000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF144000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <10>; + gpio-pins = <&pctl_gpio_e>; + }; + gpiof: gpio@FF145000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF145000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <13>; + gpio-pins = <&pctl_gpio_f>; + }; + gpiog: gpio@FF146000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF146000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <15>; + gpio-pins = <&pctl_gpio_g>; + }; + gpioh: gpio@FF147000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF147000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <18>; + gpio-pins = <&pctl_gpio_h>; + }; + gpioi: gpio@FF148000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF148000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <20>; + gpio-pins = <&pctl_gpio_i>; + }; + gpioj: gpio@FF149000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF149000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <32>; + gpio-pins = <&pctl_gpio_j>; + }; + gpiok: gpio@FF14a000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14A000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <64>; + gpio-pins = <&pctl_gpio_k>; + }; + gpiol: gpio@FF14b000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14B000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <86>; + gpio-pins = <&pctl_gpio_l>; + }; + gpiom: gpio@FF14c000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14C000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <90>; + gpio-pins = <&pctl_gpio_m>; + }; + gpion: gpio@FF14d000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14D000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <94>; + gpio-pins = <&pctl_gpio_n>; + }; + }; +}; diff --git a/arch/arc/boot/dts/abilis_tb100_dvk.dts b/arch/arc/boot/dts/abilis_tb100_dvk.dts new file mode 100644 index 000000000000..c0fd3623c393 --- /dev/null +++ b/arch/arc/boot/dts/abilis_tb100_dvk.dts @@ -0,0 +1,127 @@ +/* + * Abilis Systems TB100 Development Kit PCB device tree + * + * Copyright (C) Abilis Systems 2013 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/dts-v1/; + +/include/ "abilis_tb100.dtsi" + +/ { + chosen { + bootargs = "earlycon=uart8250,mmio32,0xff100000,9600n8 console=ttyS0,9600n8"; + }; + + aliases { }; + + memory { + device_type = "memory"; + reg = <0x80000000 0x08000000>; /* 128M */ + }; + + soc100 { + uart@FF100000 { + pinctrl-names = "abilis,simple-default"; + pinctrl-0 = <&pctl_uart0>; + }; + ethernet@FE100000 { + phy-mode = "rgmii"; + }; + + i2c0: i2c@FF120000 { + sda-hold-time = <432>; + }; + i2c1: i2c@FF121000 { + sda-hold-time = <432>; + }; + i2c2: i2c@FF122000 { + sda-hold-time = <432>; + }; + i2c3: i2c@FF123000 { + sda-hold-time = <432>; + }; + i2c4: i2c@FF124000 { + sda-hold-time = <432>; + }; + + leds { + compatible = "gpio-leds"; + power { + label = "Power"; + gpios = <&gpioi 0>; + linux,default-trigger = "default-on"; + }; + heartbeat { + label = "Heartbeat"; + gpios = <&gpioi 1>; + linux,default-trigger = "heartbeat"; + }; + led2 { + label = "LED2"; + gpios = <&gpioi 2>; + default-state = "off"; + }; + led3 { + label = "LED3"; + gpios = <&gpioi 3>; + default-state = "off"; + }; + led4 { + label = "LED4"; + gpios = <&gpioi 4>; + default-state = "off"; + }; + led5 { + label = "LED5"; + gpios = <&gpioi 5>; + default-state = "off"; + }; + led6 { + label = "LED6"; + gpios = <&gpioi 6>; + default-state = "off"; + }; + led7 { + label = "LED7"; + gpios = <&gpioi 7>; + default-state = "off"; + }; + led8 { + label = "LED8"; + gpios = <&gpioi 8>; + default-state = "off"; + }; + led9 { + label = "LED9"; + gpios = <&gpioi 9>; + default-state = "off"; + }; + led10 { + label = "LED10"; + gpios = <&gpioi 10>; + default-state = "off"; + }; + led11 { + label = "LED11"; + gpios = <&gpioi 11>; + default-state = "off"; + }; + }; + }; +}; diff --git a/arch/arc/boot/dts/abilis_tb101.dtsi b/arch/arc/boot/dts/abilis_tb101.dtsi new file mode 100644 index 000000000000..fd25c212049f --- /dev/null +++ b/arch/arc/boot/dts/abilis_tb101.dtsi @@ -0,0 +1,349 @@ +/* + * Abilis Systems TB101 SOC device tree + * + * Copyright (C) Abilis Systems 2013 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/include/ "abilis_tb10x.dtsi" + +/* interrupt specifiers + * -------------------- + * 0: rising, 1: low, 2: high, 3: falling, + */ + +/ { + clock-frequency = <500000000>; /* 500 MHZ */ + + soc100 { + bus-frequency = <166666666>; + + pll0: oscillator { + clock-frequency = <1000000000>; + }; + cpu_clk: clkdiv_cpu { + clock-mult = <1>; + clock-div = <2>; + }; + ahb_clk: clkdiv_ahb { + clock-mult = <1>; + clock-div = <6>; + }; + + iomux: iomux@FF10601c { + /* Port 1 */ + pctl_tsin_s0: pctl-tsin-s0 { /* Serial TS-in 0 */ + pingrp = "mis0_pins"; + }; + pctl_tsin_s1: pctl-tsin-s1 { /* Serial TS-in 1 */ + pingrp = "mis1_pins"; + }; + pctl_gpio_a: pctl-gpio-a { /* GPIO bank A */ + pingrp = "gpioa_pins"; + }; + pctl_tsin_p1: pctl-tsin-p1 { /* Parallel TS-in 1 */ + pingrp = "mip1_pins"; + }; + /* Port 2 */ + pctl_tsin_s2: pctl-tsin-s2 { /* Serial TS-in 2 */ + pingrp = "mis2_pins"; + }; + pctl_tsin_s3: pctl-tsin-s3 { /* Serial TS-in 3 */ + pingrp = "mis3_pins"; + }; + pctl_gpio_c: pctl-gpio-c { /* GPIO bank C */ + pingrp = "gpioc_pins"; + }; + pctl_tsin_p3: pctl-tsin-p3 { /* Parallel TS-in 3 */ + pingrp = "mip3_pins"; + }; + /* Port 3 */ + pctl_tsin_s4: pctl-tsin-s4 { /* Serial TS-in 4 */ + pingrp = "mis4_pins"; + }; + pctl_tsin_s5: pctl-tsin-s5 { /* Serial TS-in 5 */ + pingrp = "mis5_pins"; + }; + pctl_gpio_e: pctl-gpio-e { /* GPIO bank E */ + pingrp = "gpioe_pins"; + }; + pctl_tsin_p5: pctl-tsin-p5 { /* Parallel TS-in 5 */ + pingrp = "mip5_pins"; + }; + /* Port 4 */ + pctl_tsin_s6: pctl-tsin-s6 { /* Serial TS-in 6 */ + pingrp = "mis6_pins"; + }; + pctl_tsin_s7: pctl-tsin-s7 { /* Serial TS-in 7 */ + pingrp = "mis7_pins"; + }; + pctl_gpio_g: pctl-gpio-g { /* GPIO bank G */ + pingrp = "gpiog_pins"; + }; + pctl_tsin_p7: pctl-tsin-p7 { /* Parallel TS-in 7 */ + pingrp = "mip7_pins"; + }; + /* Port 5 */ + pctl_gpio_j: pctl-gpio-j { /* GPIO bank J */ + pingrp = "gpioj_pins"; + }; + pctl_gpio_k: pctl-gpio-k { /* GPIO bank K */ + pingrp = "gpiok_pins"; + }; + pctl_ciplus: pctl-ciplus { /* CI+ interface */ + pingrp = "ciplus_pins"; + }; + pctl_mcard: pctl-mcard { /* M-Card interface */ + pingrp = "mcard_pins"; + }; + pctl_stc0: pctl-stc0 { /* Smart card I/F 0 */ + pingrp = "stc0_pins"; + }; + pctl_stc1: pctl-stc1 { /* Smart card I/F 1 */ + pingrp = "stc1_pins"; + }; + /* Port 6 */ + pctl_tsout_p: pctl-tsout-p { /* Parallel TS-out */ + pingrp = "mop_pins"; + }; + pctl_tsout_s0: pctl-tsout-s0 { /* Serial TS-out 0 */ + pingrp = "mos0_pins"; + }; + pctl_tsout_s1: pctl-tsout-s1 { /* Serial TS-out 1 */ + pingrp = "mos1_pins"; + }; + pctl_tsout_s2: pctl-tsout-s2 { /* Serial TS-out 2 */ + pingrp = "mos2_pins"; + }; + pctl_tsout_s3: pctl-tsout-s3 { /* Serial TS-out 3 */ + pingrp = "mos3_pins"; + }; + /* Port 7 */ + pctl_uart0: pctl-uart0 { /* UART 0 */ + pingrp = "uart0_pins"; + }; + pctl_uart1: pctl-uart1 { /* UART 1 */ + pingrp = "uart1_pins"; + }; + pctl_gpio_l: pctl-gpio-l { /* GPIO bank L */ + pingrp = "gpiol_pins"; + }; + pctl_gpio_m: pctl-gpio-m { /* GPIO bank M */ + pingrp = "gpiom_pins"; + }; + /* Port 8 */ + pctl_spi3: pctl-spi3 { + pingrp = "spi3_pins"; + }; + pctl_jtag: pctl-jtag { + pingrp = "jtag_pins"; + }; + /* Port 9 */ + pctl_spi1: pctl-spi1 { + pingrp = "spi1_pins"; + }; + pctl_gpio_n: pctl-gpio-n { + pingrp = "gpion_pins"; + }; + /* Unmuxed GPIOs */ + pctl_gpio_b: pctl-gpio-b { + pingrp = "gpiob_pins"; + }; + pctl_gpio_d: pctl-gpio-d { + pingrp = "gpiod_pins"; + }; + pctl_gpio_f: pctl-gpio-f { + pingrp = "gpiof_pins"; + }; + pctl_gpio_h: pctl-gpio-h { + pingrp = "gpioh_pins"; + }; + pctl_gpio_i: pctl-gpio-i { + pingrp = "gpioi_pins"; + }; + }; + + gpioa: gpio@FF140000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF140000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <0>; + gpio-pins = <&pctl_gpio_a>; + }; + gpiob: gpio@FF141000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF141000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <3>; + gpio-pins = <&pctl_gpio_b>; + }; + gpioc: gpio@FF142000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF142000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <5>; + gpio-pins = <&pctl_gpio_c>; + }; + gpiod: gpio@FF143000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF143000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <8>; + gpio-pins = <&pctl_gpio_d>; + }; + gpioe: gpio@FF144000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF144000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <10>; + gpio-pins = <&pctl_gpio_e>; + }; + gpiof: gpio@FF145000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF145000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <13>; + gpio-pins = <&pctl_gpio_f>; + }; + gpiog: gpio@FF146000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF146000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <15>; + gpio-pins = <&pctl_gpio_g>; + }; + gpioh: gpio@FF147000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF147000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <18>; + gpio-pins = <&pctl_gpio_h>; + }; + gpioi: gpio@FF148000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF148000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <20>; + gpio-pins = <&pctl_gpio_i>; + }; + gpioj: gpio@FF149000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF149000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <32>; + gpio-pins = <&pctl_gpio_j>; + }; + gpiok: gpio@FF14a000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14A000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <64>; + gpio-pins = <&pctl_gpio_k>; + }; + gpiol: gpio@FF14b000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14B000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <86>; + gpio-pins = <&pctl_gpio_l>; + }; + gpiom: gpio@FF14c000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14C000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <90>; + gpio-pins = <&pctl_gpio_m>; + }; + gpion: gpio@FF14d000 { + compatible = "abilis,tb10x-gpio"; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <27 1>; + reg = <0xFF14D000 0x1000>; + gpio-controller; + #gpio-cells = <1>; + gpio-base = <94>; + gpio-pins = <&pctl_gpio_n>; + }; + }; +}; diff --git a/arch/arc/boot/dts/abilis_tb101_dvk.dts b/arch/arc/boot/dts/abilis_tb101_dvk.dts new file mode 100644 index 000000000000..6f8c381f6268 --- /dev/null +++ b/arch/arc/boot/dts/abilis_tb101_dvk.dts @@ -0,0 +1,127 @@ +/* + * Abilis Systems TB101 Development Kit PCB device tree + * + * Copyright (C) Abilis Systems 2013 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/dts-v1/; + +/include/ "abilis_tb101.dtsi" + +/ { + chosen { + bootargs = "earlycon=uart8250,mmio32,0xff100000,9600n8 console=ttyS0,9600n8"; + }; + + aliases { }; + + memory { + device_type = "memory"; + reg = <0x80000000 0x08000000>; /* 128M */ + }; + + soc100 { + uart@FF100000 { + pinctrl-names = "abilis,simple-default"; + pinctrl-0 = <&pctl_uart0>; + }; + ethernet@FE100000 { + phy-mode = "rgmii"; + }; + + i2c0: i2c@FF120000 { + sda-hold-time = <432>; + }; + i2c1: i2c@FF121000 { + sda-hold-time = <432>; + }; + i2c2: i2c@FF122000 { + sda-hold-time = <432>; + }; + i2c3: i2c@FF123000 { + sda-hold-time = <432>; + }; + i2c4: i2c@FF124000 { + sda-hold-time = <432>; + }; + + leds { + compatible = "gpio-leds"; + power { + label = "Power"; + gpios = <&gpioi 0>; + linux,default-trigger = "default-on"; + }; + heartbeat { + label = "Heartbeat"; + gpios = <&gpioi 1>; + linux,default-trigger = "heartbeat"; + }; + led2 { + label = "LED2"; + gpios = <&gpioi 2>; + default-state = "off"; + }; + led3 { + label = "LED3"; + gpios = <&gpioi 3>; + default-state = "off"; + }; + led4 { + label = "LED4"; + gpios = <&gpioi 4>; + default-state = "off"; + }; + led5 { + label = "LED5"; + gpios = <&gpioi 5>; + default-state = "off"; + }; + led6 { + label = "LED6"; + gpios = <&gpioi 6>; + default-state = "off"; + }; + led7 { + label = "LED7"; + gpios = <&gpioi 7>; + default-state = "off"; + }; + led8 { + label = "LED8"; + gpios = <&gpioi 8>; + default-state = "off"; + }; + led9 { + label = "LED9"; + gpios = <&gpioi 9>; + default-state = "off"; + }; + led10 { + label = "LED10"; + gpios = <&gpioi 10>; + default-state = "off"; + }; + led11 { + label = "LED11"; + gpios = <&gpioi 11>; + default-state = "off"; + }; + }; + }; +}; diff --git a/arch/arc/boot/dts/abilis_tb10x.dtsi b/arch/arc/boot/dts/abilis_tb10x.dtsi new file mode 100644 index 000000000000..a6139fc5aaa3 --- /dev/null +++ b/arch/arc/boot/dts/abilis_tb10x.dtsi @@ -0,0 +1,247 @@ +/* + * Abilis Systems TB10X SOC device tree + * + * Copyright (C) Abilis Systems 2013 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/* interrupt specifiers + * -------------------- + * 0: rising, 1: low, 2: high, 3: falling, + */ + +/ { + compatible = "abilis,arc-tb10x"; + #address-cells = <1>; + #size-cells = <1>; + + cpus { + #address-cells = <1>; + #size-cells = <0>; + cpu@0 { + device_type = "cpu"; + compatible = "snps,arc770d"; + reg = <0>; + }; + }; + + soc100 { + #address-cells = <1>; + #size-cells = <1>; + device_type = "soc"; + ranges = <0xfe000000 0xfe000000 0x02000000 + 0x000F0000 0x000F0000 0x00010000>; + compatible = "abilis,tb10x", "simple-bus"; + + pll0: oscillator { + compatible = "fixed-clock"; + #clock-cells = <0>; + clock-output-names = "pll0"; + }; + cpu_clk: clkdiv_cpu { + compatible = "fixed-factor-clock"; + #clock-cells = <0>; + clocks = <&pll0>; + clock-output-names = "cpu_clk"; + }; + ahb_clk: clkdiv_ahb { + compatible = "fixed-factor-clock"; + #clock-cells = <0>; + clocks = <&pll0>; + clock-output-names = "ahb_clk"; + }; + + iomux: iomux@FF10601c { + #address-cells = <1>; + #size-cells = <1>; + compatible = "abilis,tb10x-iomux"; + reg = <0xFF10601c 0x4>; + }; + + intc: interrupt-controller { + compatible = "snps,arc700-intc"; + interrupt-controller; + #interrupt-cells = <1>; + }; + tb10x_ictl: pic@fe002000 { + compatible = "abilis,tb10x_ictl"; + reg = <0xFE002000 0x20>; + interrupt-controller; + #interrupt-cells = <2>; + interrupt-parent = <&intc>; + interrupts = <5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 + 20 21 22 23 24 25 26 27 28 29 30 31>; + }; + + uart@FF100000 { + compatible = "snps,dw-apb-uart", + "abilis,simple-pinctrl"; + reg = <0xFF100000 0x100>; + clock-frequency = <166666666>; + interrupts = <25 1>; + reg-shift = <2>; + reg-io-width = <4>; + interrupt-parent = <&tb10x_ictl>; + }; + ethernet@FE100000 { + compatible = "snps,dwmac-3.70a","snps,dwmac"; + reg = <0xFE100000 0x1058>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <6 1>; + interrupt-names = "macirq"; + clocks = <&ahb_clk>; + clock-names = "stmmaceth"; + }; + dma@FE000000 { + compatible = "snps,dma-spear1340"; + reg = <0xFE000000 0x400>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <14 1>; + dma-channels = <6>; + dma-requests = <0>; + dma-masters = <1>; + #dma-cells = <3>; + chan_allocation_order = <0>; + chan_priority = <1>; + block_size = <0x7ff>; + data_width = <2 0 0 0>; + clocks = <&ahb_clk>; + clock-names = "hclk"; + }; + + i2c0: i2c@FF120000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xFF120000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <12 1>; + clocks = <&ahb_clk>; + }; + i2c1: i2c@FF121000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xFF121000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <12 1>; + clocks = <&ahb_clk>; + }; + i2c2: i2c@FF122000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xFF122000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <12 1>; + clocks = <&ahb_clk>; + }; + i2c3: i2c@FF123000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xFF123000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <12 1>; + clocks = <&ahb_clk>; + }; + i2c4: i2c@FF124000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xFF124000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <12 1>; + clocks = <&ahb_clk>; + }; + + spi0: spi@0xFE010000 { + #address-cells = <1>; + #size-cells = <0>; + cell-index = <0>; + compatible = "abilis,tb100-spi"; + num-cs = <1>; + reg = <0xFE010000 0x20>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <26 1>; + clocks = <&ahb_clk>; + }; + spi1: spi@0xFE011000 { + #address-cells = <1>; + #size-cells = <0>; + cell-index = <1>; + compatible = "abilis,tb100-spi", + "abilis,simple-pinctrl"; + num-cs = <2>; + reg = <0xFE011000 0x20>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <10 1>; + clocks = <&ahb_clk>; + }; + + tb10x_tsm: tb10x-tsm@ff316000 { + compatible = "abilis,tb100-tsm"; + reg = <0xff316000 0x400>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <17 1>; + output-clkdiv = <4>; + global-packet-delay = <0x21>; + port-packet-delay = <0>; + }; + tb10x_stream_proc: tb10x-stream-proc { + compatible = "abilis,tb100-streamproc"; + reg = <0xfff00000 0x200>, + <0x000f0000 0x10000>, + <0xfff00200 0x105>, + <0xff10600c 0x1>, + <0xfe001018 0x1>; + reg-names = "mbox", + "sp_iccm", + "mbox_irq", + "cpuctrl", + "a6it_int_force"; + interrupt-parent = <&tb10x_ictl>; + interrupts = <20 1>, <19 1>; + interrupt-names = "cmd_irq", "event_irq"; + }; + tb10x_mdsc0: tb10x-mdscr@FF300000 { + compatible = "abilis,tb100-mdscr"; + reg = <0xFF300000 0x7000>; + tb100-mdscr-manage-tsin; + }; + tb10x_mscr0: tb10x-mdscr@FF307000 { + compatible = "abilis,tb100-mdscr"; + reg = <0xFF307000 0x7000>; + }; + tb10x_scr0: tb10x-mdscr@ff30e000 { + compatible = "abilis,tb100-mdscr"; + reg = <0xFF30e000 0x4000>; + tb100-mdscr-manage-tsin; + }; + tb10x_scr1: tb10x-mdscr@ff312000 { + compatible = "abilis,tb100-mdscr"; + reg = <0xFF312000 0x4000>; + tb100-mdscr-manage-tsin; + }; + tb10x_wfb: tb10x-wfb@ff319000 { + compatible = "abilis,tb100-wfb"; + reg = <0xff319000 0x1000>; + interrupt-parent = <&tb10x_ictl>; + interrupts = <16 1>; + }; + }; +}; diff --git a/arch/arc/boot/dts/nsimosci.dts b/arch/arc/boot/dts/nsimosci.dts new file mode 100644 index 000000000000..ea16d782af58 --- /dev/null +++ b/arch/arc/boot/dts/nsimosci.dts @@ -0,0 +1,77 @@ +/* + * Copyright (C) 2013 Synopsys, Inc. (www.synopsys.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +/dts-v1/; + +/include/ "skeleton.dtsi" + +/ { + compatible = "snps,nsimosci"; + clock-frequency = <80000000>; /* 80 MHZ */ + #address-cells = <1>; + #size-cells = <1>; + interrupt-parent = <&intc>; + + chosen { + bootargs = "console=tty0 consoleblank=0"; + }; + + aliases { + serial0 = &uart0; + }; + + memory { + device_type = "memory"; + reg = <0x80000000 0x10000000>; /* 256M */ + }; + + fpga { + compatible = "simple-bus"; + #address-cells = <1>; + #size-cells = <1>; + + /* child and parent address space 1:1 mapped */ + ranges; + + intc: interrupt-controller { + compatible = "snps,arc700-intc"; + interrupt-controller; + #interrupt-cells = <1>; + }; + + uart0: serial@c0000000 { + compatible = "snps,dw-apb-uart"; + reg = <0xc0000000 0x2000>; + interrupts = <11>; + #clock-frequency = <80000000>; + clock-frequency = <3686400>; + baud = <115200>; + reg-shift = <2>; + reg-io-width = <4>; + status = "okay"; + }; + + pgu0: pgu@c9000000 { + compatible = "snps,arcpgufb"; + reg = <0xc9000000 0x400>; + }; + + ps2: ps2@c9001000 { + compatible = "snps,arc_ps2"; + reg = <0xc9000400 0x14>; + interrupts = <13>; + interrupt-names = "arc_ps2_irq"; + }; + + eth0: ethernet@c0003000 { + compatible = "snps,oscilan"; + reg = <0xc0003000 0x44>; + interrupts = <7>, <8>; + interrupt-names = "rx", "tx"; + }; + }; +}; diff --git a/arch/arc/configs/fpga_defconfig b/arch/arc/configs/fpga_defconfig index b8698067ebbe..95350be6ef6f 100644 --- a/arch/arc/configs/fpga_defconfig +++ b/arch/arc/configs/fpga_defconfig @@ -9,7 +9,7 @@ CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y -CONFIG_INITRAMFS_SOURCE="../arc_initramfs" +CONFIG_INITRAMFS_SOURCE="../arc_initramfs/" CONFIG_KALLSYMS_ALL=y CONFIG_EMBEDDED=y # CONFIG_SLUB_DEBUG is not set @@ -24,6 +24,7 @@ CONFIG_ARC_PLAT_FPGA_LEGACY=y CONFIG_ARC_BOARD_ML509=y # CONFIG_ARC_HAS_RTSC is not set CONFIG_ARC_BUILTIN_DTB_NAME="angel4" +CONFIG_PREEMPT=y # CONFIG_COMPACTION is not set # CONFIG_CROSS_MEMORY_ATTACH is not set CONFIG_NET=y diff --git a/arch/arc/configs/nsimosci_defconfig b/arch/arc/configs/nsimosci_defconfig new file mode 100644 index 000000000000..446c96c24eff --- /dev/null +++ b/arch/arc/configs/nsimosci_defconfig @@ -0,0 +1,75 @@ +CONFIG_CROSS_COMPILE="arc-elf32-" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_DEFAULT_HOSTNAME="ARCLinux" +# CONFIG_SWAP is not set +CONFIG_HIGH_RES_TIMERS=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_NAMESPACES=y +# CONFIG_UTS_NS is not set +# CONFIG_PID_NS is not set +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="../arc_initramfs" +CONFIG_KALLSYMS_ALL=y +CONFIG_EMBEDDED=y +# CONFIG_SLUB_DEBUG is not set +# CONFIG_COMPAT_BRK is not set +CONFIG_KPROBES=y +CONFIG_MODULES=y +# CONFIG_LBDAF is not set +# CONFIG_BLK_DEV_BSG is not set +# CONFIG_IOSCHED_DEADLINE is not set +# CONFIG_IOSCHED_CFQ is not set +CONFIG_ARC_PLAT_FPGA_LEGACY=y +CONFIG_ARC_BOARD_ML509=y +# CONFIG_ARC_IDE is not set +# CONFIG_ARCTANGENT_EMAC is not set +# CONFIG_ARC_HAS_RTSC is not set +CONFIG_ARC_BUILTIN_DTB_NAME="nsimosci" +# CONFIG_COMPACTION is not set +# CONFIG_CROSS_MEMORY_ATTACH is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_UNIX_DIAG=y +CONFIG_NET_KEY=y +CONFIG_INET=y +# CONFIG_IPV6 is not set +# CONFIG_STANDALONE is not set +# CONFIG_PREVENT_FIRMWARE_BUILD is not set +# CONFIG_FIRMWARE_IN_KERNEL is not set +# CONFIG_BLK_DEV is not set +CONFIG_NETDEVICES=y +# CONFIG_INPUT_MOUSEDEV_PSAUX is not set +# CONFIG_MOUSE_PS2_ALPS is not set +# CONFIG_MOUSE_PS2_LOGIPS2PP is not set +# CONFIG_MOUSE_PS2_SYNAPTICS is not set +# CONFIG_MOUSE_PS2_TRACKPOINT is not set +CONFIG_MOUSE_PS2_TOUCHKIT=y +# CONFIG_SERIO_I8042 is not set +# CONFIG_SERIO_SERPORT is not set +CONFIG_SERIO_ARC_PS2=y +# CONFIG_LEGACY_PTYS is not set +# CONFIG_DEVKMEM is not set +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_DW=y +CONFIG_SERIAL_ARC=y +CONFIG_SERIAL_ARC_CONSOLE=y +# CONFIG_HW_RANDOM is not set +# CONFIG_HWMON is not set +CONFIG_FB=y +# CONFIG_VGA_CONSOLE is not set +CONFIG_FRAMEBUFFER_CONSOLE=y +CONFIG_LOGO=y +# CONFIG_HID is not set +# CONFIG_USB_SUPPORT is not set +# CONFIG_IOMMU_SUPPORT is not set +CONFIG_EXT2_FS=y +CONFIG_EXT2_FS_XATTR=y +CONFIG_TMPFS=y +# CONFIG_MISC_FILESYSTEMS is not set +CONFIG_NFS_FS=y +# CONFIG_ENABLE_WARN_DEPRECATED is not set +# CONFIG_ENABLE_MUST_CHECK is not set +CONFIG_XZ_DEC=y diff --git a/arch/arc/configs/tb10x_defconfig b/arch/arc/configs/tb10x_defconfig new file mode 100644 index 000000000000..4fa5cd9f2202 --- /dev/null +++ b/arch/arc/configs/tb10x_defconfig @@ -0,0 +1,117 @@ +CONFIG_CROSS_COMPILE="arc-elf32-" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_DEFAULT_HOSTNAME="tb10x" +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_HIGH_RES_TIMERS=y +CONFIG_BSD_PROCESS_ACCT=y +CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=16 +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="../tb10x-rootfs.cpio" +CONFIG_INITRAMFS_ROOT_UID=2100 +CONFIG_INITRAMFS_ROOT_GID=501 +# CONFIG_RD_GZIP is not set +CONFIG_SYSCTL_SYSCALL=y +CONFIG_KALLSYMS_ALL=y +# CONFIG_AIO is not set +CONFIG_EMBEDDED=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_FORCE_LOAD=y +CONFIG_MODULE_UNLOAD=y +# CONFIG_BLOCK is not set +CONFIG_ARC_PLAT_TB10X=y +CONFIG_ARC_CACHE_LINE_SHIFT=5 +# CONFIG_ARC_HAS_RTSC is not set +CONFIG_ARC_STACK_NONEXEC=y +CONFIG_HZ=250 +CONFIG_ARC_BUILTIN_DTB_NAME="abilis_tb100_dvk" +CONFIG_PREEMPT_VOLUNTARY=y +# CONFIG_COMPACTION is not set +# CONFIG_CROSS_MEMORY_ATTACH is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_INET_XFRM_MODE_TRANSPORT is not set +# CONFIG_INET_XFRM_MODE_TUNNEL is not set +# CONFIG_INET_XFRM_MODE_BEET is not set +# CONFIG_INET_LRO is not set +# CONFIG_INET_DIAG is not set +# CONFIG_IPV6 is not set +# CONFIG_WIRELESS is not set +# CONFIG_FIRMWARE_IN_KERNEL is not set +CONFIG_PROC_DEVICETREE=y +CONFIG_NETDEVICES=y +# CONFIG_NET_CADENCE is not set +# CONFIG_NET_VENDOR_BROADCOM is not set +# CONFIG_NET_VENDOR_INTEL is not set +# CONFIG_NET_VENDOR_MARVELL is not set +# CONFIG_NET_VENDOR_MICREL is not set +# CONFIG_NET_VENDOR_NATSEMI is not set +# CONFIG_NET_VENDOR_SEEQ is not set +CONFIG_STMMAC_ETH=y +CONFIG_STMMAC_DEBUG_FS=y +CONFIG_STMMAC_DA=y +CONFIG_STMMAC_CHAINED=y +# CONFIG_NET_VENDOR_WIZNET is not set +# CONFIG_WLAN is not set +# CONFIG_INPUT is not set +# CONFIG_SERIO is not set +# CONFIG_VT is not set +CONFIG_DEVPTS_MULTIPLE_INSTANCES=y +# CONFIG_LEGACY_PTYS is not set +# CONFIG_DEVKMEM is not set +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_NR_UARTS=1 +CONFIG_SERIAL_8250_RUNTIME_UARTS=1 +CONFIG_SERIAL_8250_DW=y +# CONFIG_HW_RANDOM is not set +CONFIG_I2C=y +# CONFIG_I2C_COMPAT is not set +CONFIG_I2C_DESIGNWARE_PLATFORM=y +CONFIG_GPIO_SYSFS=y +# CONFIG_HWMON is not set +# CONFIG_USB_SUPPORT is not set +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_GPIO=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=y +CONFIG_LEDS_TRIGGER_ONESHOT=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_LEDS_TRIGGER_CPU=y +CONFIG_LEDS_TRIGGER_GPIO=y +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y +CONFIG_LEDS_TRIGGER_TRANSIENT=y +CONFIG_DMADEVICES=y +CONFIG_DW_DMAC=y +CONFIG_NET_DMA=y +CONFIG_ASYNC_TX_DMA=y +# CONFIG_IOMMU_SUPPORT is not set +# CONFIG_DNOTIFY is not set +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_CONFIGFS_FS=y +# CONFIG_MISC_FILESYSTEMS is not set +# CONFIG_NETWORK_FILESYSTEMS is not set +# CONFIG_ENABLE_WARN_DEPRECATED is not set +CONFIG_MAGIC_SYSRQ=y +CONFIG_STRIP_ASM_SYMS=y +CONFIG_DEBUG_FS=y +CONFIG_HEADERS_CHECK=y +CONFIG_DEBUG_SECTION_MISMATCH=y +CONFIG_DETECT_HUNG_TASK=y +CONFIG_SCHEDSTATS=y +CONFIG_TIMER_STATS=y +CONFIG_DEBUG_INFO=y +CONFIG_DEBUG_MEMORY_INIT=y +CONFIG_DEBUG_STACKOVERFLOW=y +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild index 48af742f8b5a..d8dd660898b9 100644 --- a/arch/arc/include/asm/Kbuild +++ b/arch/arc/include/asm/Kbuild @@ -32,7 +32,6 @@ generic-y += resource.h generic-y += scatterlist.h generic-y += sembuf.h generic-y += shmbuf.h -generic-y += shmparam.h generic-y += siginfo.h generic-y += socket.h generic-y += sockios.h diff --git a/arch/arc/include/asm/cache.h b/arch/arc/include/asm/cache.h index 6632273861fd..d5555fe4742a 100644 --- a/arch/arc/include/asm/cache.h +++ b/arch/arc/include/asm/cache.h @@ -55,9 +55,6 @@ : "r"(data), "r"(ptr)); \ }) -/* used to give SHMLBA a value to avoid Cache Aliasing */ -extern unsigned int ARC_shmlba; - #define ARCH_DMA_MINALIGN L1_CACHE_BYTES /* diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h index 97ee96f26505..9f841af41092 100644 --- a/arch/arc/include/asm/cacheflush.h +++ b/arch/arc/include/asm/cacheflush.h @@ -19,13 +19,24 @@ #define _ASM_CACHEFLUSH_H #include +#include + +/* + * Semantically we need this because icache doesn't snoop dcache/dma. + * However ARC Cache flush requires paddr as well as vaddr, latter not available + * in the flush_icache_page() API. So we no-op it but do the equivalent work + * in update_mmu_cache() + */ +#define flush_icache_page(vma, page) void flush_cache_all(void); void flush_icache_range(unsigned long start, unsigned long end); -void flush_icache_page(struct vm_area_struct *vma, struct page *page); -void flush_icache_range_vaddr(unsigned long paddr, unsigned long u_vaddr, - int len); +void __sync_icache_dcache(unsigned long paddr, unsigned long vaddr, int len); +void __inv_icache_page(unsigned long paddr, unsigned long vaddr); +void ___flush_dcache_page(unsigned long paddr, unsigned long vaddr); +#define __flush_dcache_page(p, v) \ + ___flush_dcache_page((unsigned long)p, (unsigned long)v) #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1 @@ -42,23 +53,60 @@ void dma_cache_wback(unsigned long start, unsigned long sz); #define flush_cache_vmap(start, end) flush_cache_all() #define flush_cache_vunmap(start, end) flush_cache_all() -/* - * VM callbacks when entire/range of user-space V-P mappings are - * torn-down/get-invalidated - * - * Currently we don't support D$ aliasing configs for our VIPT caches - * NOPS for VIPT Cache with non-aliasing D$ configurations only - */ -#define flush_cache_dup_mm(mm) /* called on fork */ +#define flush_cache_dup_mm(mm) /* called on fork (VIVT only) */ + +#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING + #define flush_cache_mm(mm) /* called on munmap/exit */ #define flush_cache_range(mm, u_vstart, u_vend) #define flush_cache_page(vma, u_vaddr, pfn) /* PF handling/COW-break */ +#else /* VIPT aliasing dcache */ + +/* To clear out stale userspace mappings */ +void flush_cache_mm(struct mm_struct *mm); +void flush_cache_range(struct vm_area_struct *vma, + unsigned long start,unsigned long end); +void flush_cache_page(struct vm_area_struct *vma, + unsigned long user_addr, unsigned long page); + +/* + * To make sure that userspace mapping is flushed to memory before + * get_user_pages() uses a kernel mapping to access the page + */ +#define ARCH_HAS_FLUSH_ANON_PAGE +void flush_anon_page(struct vm_area_struct *vma, + struct page *page, unsigned long u_vaddr); + +#endif /* CONFIG_ARC_CACHE_VIPT_ALIASING */ + +/* + * Simple wrapper over config option + * Bootup code ensures that hardware matches kernel configuration + */ +static inline int cache_is_vipt_aliasing(void) +{ +#ifdef CONFIG_ARC_CACHE_VIPT_ALIASING + return 1; +#else + return 0; +#endif +} + +#define CACHE_COLOR(addr) (((unsigned long)(addr) >> (PAGE_SHIFT)) & 3) + +/* + * checks if two addresses (after page aligning) index into same cache set + */ +#define addr_not_cache_congruent(addr1, addr2) \ + cache_is_vipt_aliasing() ? \ + (CACHE_COLOR(addr1) != CACHE_COLOR(addr2)) : 0 \ + #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ do { \ memcpy(dst, src, len); \ if (vma->vm_flags & VM_EXEC) \ - flush_icache_range_vaddr((unsigned long)(dst), vaddr, len);\ + __sync_icache_dcache((unsigned long)(dst), vaddr, len); \ } while (0) #define copy_from_user_page(vma, page, vaddr, dst, src, len) \ diff --git a/arch/arc/include/asm/irq.h b/arch/arc/include/asm/irq.h index 4c588f9820cf..57898a17eb82 100644 --- a/arch/arc/include/asm/irq.h +++ b/arch/arc/include/asm/irq.h @@ -9,7 +9,8 @@ #ifndef __ASM_ARC_IRQ_H #define __ASM_ARC_IRQ_H -#define NR_IRQS 32 +#define NR_CPU_IRQS 32 /* number of interrupt lines of ARC770 CPU */ +#define NR_IRQS 128 /* allow some CPU external IRQ handling */ /* Platform Independent IRQs */ #define TIMER0_IRQ 3 diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h index bdf546104551..374a35514116 100644 --- a/arch/arc/include/asm/page.h +++ b/arch/arc/include/asm/page.h @@ -16,13 +16,27 @@ #define get_user_page(vaddr) __get_free_page(GFP_KERNEL) #define free_user_page(page, addr) free_page(addr) -/* TBD: for now don't worry about VIPT D$ aliasing */ #define clear_page(paddr) memset((paddr), 0, PAGE_SIZE) #define copy_page(to, from) memcpy((to), (from), PAGE_SIZE) +#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING + #define clear_user_page(addr, vaddr, pg) clear_page(addr) #define copy_user_page(vto, vfrom, vaddr, pg) copy_page(vto, vfrom) +#else /* VIPT aliasing dcache */ + +struct vm_area_struct; +struct page; + +#define __HAVE_ARCH_COPY_USER_HIGHPAGE + +void copy_user_highpage(struct page *to, struct page *from, + unsigned long u_vaddr, struct vm_area_struct *vma); +void clear_user_page(void *to, unsigned long u_vaddr, struct page *page); + +#endif /* CONFIG_ARC_CACHE_VIPT_ALIASING */ + #undef STRICT_MM_TYPECHECKS #ifdef STRICT_MM_TYPECHECKS diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h index b7e36684c091..1cc4720faccb 100644 --- a/arch/arc/include/asm/pgtable.h +++ b/arch/arc/include/asm/pgtable.h @@ -395,6 +395,9 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #include +/* to cope with aliasing VIPT cache */ +#define HAVE_ARCH_UNMAPPED_AREA + /* * No page table caches to initialise */ diff --git a/arch/arc/include/asm/serial.h b/arch/arc/include/asm/serial.h index 4dff5a1e4128..602b0970a764 100644 --- a/arch/arc/include/asm/serial.h +++ b/arch/arc/include/asm/serial.h @@ -22,4 +22,14 @@ #define BASE_BAUD (arc_get_core_freq() / 16) +/* + * This is definitely going to break early 8250 consoles on multi-platform + * images but hey, it won't add any code complexity for a debug feature of + * one broken driver. + */ +#ifdef CONFIG_ARC_PLAT_TB10X +#undef BASE_BAUD +#define BASE_BAUD (arc_get_core_freq() / 16 / 3) +#endif + #endif /* _ASM_ARC_SERIAL_H */ diff --git a/arch/arc/include/asm/shmparam.h b/arch/arc/include/asm/shmparam.h new file mode 100644 index 000000000000..fffeecc04270 --- /dev/null +++ b/arch/arc/include/asm/shmparam.h @@ -0,0 +1,18 @@ +/* + * Copyright (C) 2013 Synopsys, Inc. (www.synopsys.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifndef __ARC_ASM_SHMPARAM_H +#define __ARC_ASM_SHMPARAM_H + +/* Handle upto 2 cache bins */ +#define SHMLBA (2 * PAGE_SIZE) + +/* Enforce SHMLBA in shmat */ +#define __ARCH_FORCE_SHMLBA + +#endif diff --git a/arch/arc/include/asm/tlb.h b/arch/arc/include/asm/tlb.h index 3eb2ce0bdfa3..85b6df839bd7 100644 --- a/arch/arc/include/asm/tlb.h +++ b/arch/arc/include/asm/tlb.h @@ -21,20 +21,35 @@ #ifndef __ASSEMBLY__ -#define tlb_flush(tlb) local_flush_tlb_mm((tlb)->mm) +#define tlb_flush(tlb) \ +do { \ + if (tlb->fullmm) \ + flush_tlb_mm((tlb)->mm); \ +} while (0) /* * This pair is called at time of munmap/exit to flush cache and TLB entries * for mappings being torn down. - * 1) cache-flush part -implemented via tlb_start_vma( ) can be NOP (for now) - * as we don't support aliasing configs in our VIPT D$. - * 2) tlb-flush part - implemted via tlb_end_vma( ) can be NOP as well- - * albiet for difft reasons - its better handled by moving to new ASID + * 1) cache-flush part -implemented via tlb_start_vma( ) for VIPT aliasing D$ + * 2) tlb-flush part - implemted via tlb_end_vma( ) flushes the TLB range * * Note, read http://lkml.org/lkml/2004/1/15/6 */ +#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING #define tlb_start_vma(tlb, vma) -#define tlb_end_vma(tlb, vma) +#else +#define tlb_start_vma(tlb, vma) \ +do { \ + if (!tlb->fullmm) \ + flush_cache_range(vma, vma->vm_start, vma->vm_end); \ +} while(0) +#endif + +#define tlb_end_vma(tlb, vma) \ +do { \ + if (!tlb->fullmm) \ + flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ +} while (0) #define __tlb_remove_tlb_entry(tlb, ptep, address) diff --git a/arch/arc/kernel/asm-offsets.c b/arch/arc/kernel/asm-offsets.c index 0dc148ebce74..7dcda7025241 100644 --- a/arch/arc/kernel/asm-offsets.c +++ b/arch/arc/kernel/asm-offsets.c @@ -11,9 +11,9 @@ #include #include #include +#include #include #include -#include int main(void) { diff --git a/arch/arc/kernel/clk.c b/arch/arc/kernel/clk.c index 66ce0dc917fb..10c7b0b5a079 100644 --- a/arch/arc/kernel/clk.c +++ b/arch/arc/kernel/clk.c @@ -8,7 +8,7 @@ #include -unsigned long core_freq = 800000000; +unsigned long core_freq = 80000000; /* * As of now we default to device-tree provided clock diff --git a/arch/arc/kernel/disasm.c b/arch/arc/kernel/disasm.c index d14764ae2c60..b8a549c4f540 100644 --- a/arch/arc/kernel/disasm.c +++ b/arch/arc/kernel/disasm.c @@ -12,8 +12,8 @@ #include #include #include +#include #include -#include #if defined(CONFIG_KGDB) || defined(CONFIG_ARC_MISALIGN_ACCESS) || \ defined(CONFIG_KPROBES) diff --git a/arch/arc/kernel/entry.S b/arch/arc/kernel/entry.S index 91eeab81f52d..0c6d664d4a83 100644 --- a/arch/arc/kernel/entry.S +++ b/arch/arc/kernel/entry.S @@ -393,12 +393,14 @@ ARC_ENTRY EV_TLBProtV #ifdef CONFIG_ARC_MISALIGN_ACCESS SAVE_CALLEE_SAVED_USER mov r3, sp ; callee_regs -#endif bl do_misaligned_access -#ifdef CONFIG_ARC_MISALIGN_ACCESS - DISCARD_CALLEE_SAVED_USER + ; TBD: optimize - do this only if a callee reg was involved + ; either a dst of emulated LD/ST or src with address-writeback + RESTORE_CALLEE_SAVED_USER +#else + bl do_misaligned_error #endif b ret_from_exception diff --git a/arch/arc/kernel/irq.c b/arch/arc/kernel/irq.c index 551c10dff481..8115fa531575 100644 --- a/arch/arc/kernel/irq.c +++ b/arch/arc/kernel/irq.c @@ -11,6 +11,8 @@ #include #include #include +#include +#include "../../drivers/irqchip/irqchip.h" #include #include #include @@ -26,7 +28,7 @@ * -Disable all IRQs (on CPU side) * -Optionally, setup the High priority Interrupts as Level 2 IRQs */ -void __init arc_init_IRQ(void) +void __cpuinit arc_init_IRQ(void) { int level_mask = 0; @@ -97,15 +99,13 @@ static const struct irq_domain_ops arc_intc_domain_ops = { static struct irq_domain *root_domain; -void __init init_onchip_IRQ(void) +static int __init +init_onchip_IRQ(struct device_node *intc, struct device_node *parent) { - struct device_node *intc = NULL; + if (parent) + panic("DeviceTree incore intc not a root irq controller\n"); - intc = of_find_compatible_node(NULL, NULL, "snps,arc700-intc"); - if(!intc) - panic("DeviceTree Missing incore intc\n"); - - root_domain = irq_domain_add_legacy(intc, NR_IRQS, 0, 0, + root_domain = irq_domain_add_legacy(intc, NR_CPU_IRQS, 0, 0, &arc_intc_domain_ops, NULL); if (!root_domain) @@ -113,8 +113,12 @@ void __init init_onchip_IRQ(void) /* with this we don't need to export root_domain */ irq_set_default_host(root_domain); + + return 0; } +IRQCHIP_DECLARE(arc_intc, "snps,arc700-intc", init_onchip_IRQ); + /* * Late Interrupt system init called from start_kernel for Boot CPU only * @@ -123,12 +127,13 @@ void __init init_onchip_IRQ(void) */ void __init init_IRQ(void) { - init_onchip_IRQ(); - /* Any external intc can be setup here */ if (machine_desc->init_irq) machine_desc->init_irq(); + /* process the entire interrupt tree in one go */ + irqchip_init(); + #ifdef CONFIG_SMP /* Master CPU can initialize it's side of IPI */ if (machine_desc->init_smp) diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c index 3bfeacb674de..5a7b80e2d883 100644 --- a/arch/arc/kernel/kprobes.c +++ b/arch/arc/kernel/kprobes.c @@ -10,7 +10,6 @@ #include #include #include -#include #include #include #include diff --git a/arch/arc/kernel/module.c b/arch/arc/kernel/module.c index cdd359352c0a..376e04622962 100644 --- a/arch/arc/kernel/module.c +++ b/arch/arc/kernel/module.c @@ -47,7 +47,7 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, } } #endif - return 0; + return 0; } void module_arch_cleanup(struct module *mod) @@ -141,5 +141,5 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs, mod->arch.unw_info = unw; } #endif - return 0; + return 0; } diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index 2d95ac07df7b..b2b3731dd1e9 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -14,14 +14,13 @@ #include #include #include +#include #include #include #include -#include #include #include #include -#include #include #include #include @@ -32,14 +31,14 @@ int running_on_hw = 1; /* vs. on ISS */ char __initdata command_line[COMMAND_LINE_SIZE]; -struct machine_desc *machine_desc __initdata; +struct machine_desc *machine_desc __cpuinitdata; struct task_struct *_current_task[NR_CPUS]; /* For stack switching */ struct cpuinfo_arc cpuinfo_arc700[NR_CPUS]; -void __init read_arc_build_cfg_regs(void) +void __cpuinit read_arc_build_cfg_regs(void) { struct bcr_perip uncached_space; struct cpuinfo_arc *cpu = &cpuinfo_arc700[smp_processor_id()]; @@ -238,7 +237,7 @@ char *arc_extn_mumbojumbo(int cpu_id, char *buf, int len) return buf; } -void __init arc_chk_ccms(void) +void __cpuinit arc_chk_ccms(void) { #if defined(CONFIG_ARC_HAS_DCCM) || defined(CONFIG_ARC_HAS_ICCM) struct cpuinfo_arc *cpu = &cpuinfo_arc700[smp_processor_id()]; @@ -273,7 +272,7 @@ void __init arc_chk_ccms(void) * hardware has dedicated regs which need to be saved/restored on ctx-sw * (Single Precision uses core regs), thus kernel is kind of oblivious to it */ -void __init arc_chk_fpu(void) +void __cpuinit arc_chk_fpu(void) { struct cpuinfo_arc *cpu = &cpuinfo_arc700[smp_processor_id()]; @@ -294,7 +293,7 @@ void __init arc_chk_fpu(void) * such as only for boot CPU etc */ -void __init setup_processor(void) +void __cpuinit setup_processor(void) { char str[512]; int cpu_id = smp_processor_id(); @@ -319,23 +318,20 @@ void __init setup_processor(void) void __init setup_arch(char **cmdline_p) { + /* This also populates @boot_command_line from /bootargs */ + machine_desc = setup_machine_fdt(__dtb_start); + if (!machine_desc) + panic("Embedded DT invalid\n"); + + /* Append any u-boot provided cmdline */ #ifdef CONFIG_CMDLINE_UBOOT - /* Make sure that a whitespace is inserted before */ - strlcat(command_line, " ", sizeof(command_line)); + /* Add a whitespace seperator between the 2 cmdlines */ + strlcat(boot_command_line, " ", COMMAND_LINE_SIZE); + strlcat(boot_command_line, command_line, COMMAND_LINE_SIZE); #endif - /* - * Append .config cmdline to base command line, which might already - * contain u-boot "bootargs" (handled by head.S, if so configured) - */ - strlcat(command_line, CONFIG_CMDLINE, sizeof(command_line)); /* Save unparsed command line copy for /proc/cmdline */ - strlcpy(boot_command_line, command_line, COMMAND_LINE_SIZE); - *cmdline_p = command_line; - - machine_desc = setup_machine_fdt(__dtb_start); - if (!machine_desc) - panic("Embedded DT invalid\n"); + *cmdline_p = boot_command_line; /* To force early parsing of things like mem=xxx */ parse_early_param(); diff --git a/arch/arc/kernel/time.c b/arch/arc/kernel/time.c index f13f72807aa5..09f4309aa2c0 100644 --- a/arch/arc/kernel/time.c +++ b/arch/arc/kernel/time.c @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include diff --git a/arch/arc/kernel/traps.c b/arch/arc/kernel/traps.c index 7496995371e8..0471d9c9dd54 100644 --- a/arch/arc/kernel/traps.c +++ b/arch/arc/kernel/traps.c @@ -16,11 +16,12 @@ #include #include #include -#include +#include +#include +#include #include -#include #include -#include +#include void __init trap_init(void) { @@ -83,6 +84,7 @@ DO_ERROR_INFO(SIGILL, "Invalid Extn Insn", do_extension_fault, ILL_ILLOPC) DO_ERROR_INFO(SIGILL, "Illegal Insn (or Seq)", insterror_is_error, ILL_ILLOPC) DO_ERROR_INFO(SIGBUS, "Invalid Mem Access", do_memory_error, BUS_ADRERR) DO_ERROR_INFO(SIGTRAP, "Breakpoint Set", trap_is_brkpt, TRAP_BRKPT) +DO_ERROR_INFO(SIGBUS, "Misaligned Access", do_misaligned_error, BUS_ADRALN) #ifdef CONFIG_ARC_MISALIGN_ACCESS /* @@ -91,21 +93,11 @@ DO_ERROR_INFO(SIGTRAP, "Breakpoint Set", trap_is_brkpt, TRAP_BRKPT) int do_misaligned_access(unsigned long cause, unsigned long address, struct pt_regs *regs, struct callee_regs *cregs) { - if (misaligned_fixup(address, regs, cause, cregs) != 0) { - siginfo_t info; - - info.si_signo = SIGBUS; - info.si_errno = 0; - info.si_code = BUS_ADRALN; - info.si_addr = (void __user *)address; - return handle_exception(cause, "Misaligned Access", regs, - &info); - } + if (misaligned_fixup(address, regs, cause, cregs) != 0) + return do_misaligned_error(cause, address, regs); + return 0; } - -#else -DO_ERROR_INFO(SIGSEGV, "Misaligned Access", do_misaligned_access, SEGV_ACCERR) #endif /* diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c index 0aec01985bf9..11c301b81c92 100644 --- a/arch/arc/kernel/troubleshoot.c +++ b/arch/arc/kernel/troubleshoot.c @@ -26,7 +26,6 @@ static noinline void print_reg_file(long *reg_rev, int start_num) char buf[512]; int n = 0, len = sizeof(buf); - /* weird loop because pt_regs regs rev r12..r0, r25..r13 */ for (i = start_num; i < start_num + 13; i++) { n += scnprintf(buf + n, len - n, "r%02u: 0x%08lx\t", i, (unsigned long)*reg_rev); @@ -34,13 +33,18 @@ static noinline void print_reg_file(long *reg_rev, int start_num) if (((i + 1) % 3) == 0) n += scnprintf(buf + n, len - n, "\n"); + /* because pt_regs has regs reversed: r12..r0, r25..r13 */ reg_rev--; } if (start_num != 0) n += scnprintf(buf + n, len - n, "\n\n"); - pr_info("%s", buf); + /* To continue printing callee regs on same line as scratch regs */ + if (start_num == 0) + pr_info("%s", buf); + else + pr_cont("%s\n", buf); } static void show_callee_regs(struct callee_regs *cregs) @@ -83,6 +87,10 @@ static void show_faulting_vma(unsigned long address, char *buf) dev_t dev = 0; char *nm = buf; + /* can't use print_vma_addr() yet as it doesn't check for + * non-inclusive vma + */ + vma = find_vma(current->active_mm, address); /* check against the find_vma( ) behaviour which returns the next VMA @@ -98,10 +106,13 @@ static void show_faulting_vma(unsigned long address, char *buf) ino = inode->i_ino; } pr_info(" @off 0x%lx in [%s]\n" - " VMA: 0x%08lx to 0x%08lx\n\n", - address - vma->vm_start, nm, vma->vm_start, vma->vm_end); - } else + " VMA: 0x%08lx to 0x%08lx\n", + vma->vm_start < TASK_UNMAPPED_BASE ? + address : address - vma->vm_start, + nm, vma->vm_start, vma->vm_end); + } else { pr_info(" @No matching VMA found\n"); + } } static void show_ecr_verbose(struct pt_regs *regs) @@ -110,7 +121,7 @@ static void show_ecr_verbose(struct pt_regs *regs) unsigned long address; cause_reg = current->thread.cause_code; - pr_info("\n[ECR]: 0x%08x => ", cause_reg); + pr_info("\n[ECR ]: 0x%08x => ", cause_reg); /* For Data fault, this is data address not instruction addr */ address = current->thread.fault_address; @@ -120,7 +131,7 @@ static void show_ecr_verbose(struct pt_regs *regs) /* For DTLB Miss or ProtV, display the memory involved too */ if (vec == ECR_V_DTLB_MISS) { - pr_cont("Invalid (%s) @ 0x%08lx by insn @ 0x%08lx\n", + pr_cont("Invalid %s 0x%08lx by insn @ 0x%08lx\n", (cause_code == 0x01) ? "Read From" : ((cause_code == 0x02) ? "Write to" : "EX"), address, regs->ret); @@ -168,20 +179,23 @@ void show_regs(struct pt_regs *regs) if (current->thread.cause_code) show_ecr_verbose(regs); - pr_info("[EFA]: 0x%08lx\n", current->thread.fault_address); - pr_info("[ERET]: 0x%08lx (PC of Faulting Instr)\n", regs->ret); + pr_info("[EFA ]: 0x%08lx\n[BLINK ]: %pS\n[ERET ]: %pS\n", + current->thread.fault_address, + (void *)regs->blink, (void *)regs->ret); - show_faulting_vma(regs->ret, buf); /* faulting code, not data */ + if (user_mode(regs)) + show_faulting_vma(regs->ret, buf); /* faulting code, not data */ - /* can't use print_vma_addr() yet as it doesn't check for - * non-inclusive vma - */ + pr_info("[STAT32]: 0x%08lx", regs->status32); + +#define STS_BIT(r, bit) r->status32 & STATUS_##bit##_MASK ? #bit : "" + if (!user_mode(regs)) + pr_cont(" : %2s %2s %2s %2s %2s\n", + STS_BIT(regs, AE), STS_BIT(regs, A2), STS_BIT(regs, A1), + STS_BIT(regs, E2), STS_BIT(regs, E1)); - /* print special regs */ - pr_info("status32: 0x%08lx\n", regs->status32); - pr_info(" SP: 0x%08lx\tFP: 0x%08lx\n", regs->sp, regs->fp); - pr_info("BTA: 0x%08lx\tBLINK: 0x%08lx\n", - regs->bta, regs->blink); + pr_info("BTA: 0x%08lx\t SP: 0x%08lx\t FP: 0x%08lx\n", + regs->bta, regs->sp, regs->fp); pr_info("LPS: 0x%08lx\tLPE: 0x%08lx\tLPC: 0x%08lx\n", regs->lp_start, regs->lp_end, regs->lp_count); diff --git a/arch/arc/mm/Makefile b/arch/arc/mm/Makefile index 168dc146a8f6..ac95cc239c1e 100644 --- a/arch/arc/mm/Makefile +++ b/arch/arc/mm/Makefile @@ -7,4 +7,4 @@ # obj-y := extable.o ioremap.o dma.o fault.o init.o -obj-y += tlb.o tlbex.o cache_arc700.o +obj-y += tlb.o tlbex.o cache_arc700.o mmap.o diff --git a/arch/arc/mm/cache_arc700.c b/arch/arc/mm/cache_arc700.c index 88d617d84234..2f12bca8aef3 100644 --- a/arch/arc/mm/cache_arc700.c +++ b/arch/arc/mm/cache_arc700.c @@ -68,20 +68,11 @@ #include #include #include +#include #include #include #include - -#ifdef CONFIG_ARC_HAS_ICACHE -static void __ic_line_inv_no_alias(unsigned long, int); -static void __ic_line_inv_2_alias(unsigned long, int); -static void __ic_line_inv_4_alias(unsigned long, int); - -/* Holds the ptr to flush routine, dependign on size due to aliasing issues */ -static void (*___flush_icache_rtn) (unsigned long, int); -#endif - char *arc_cache_mumbojumbo(int cpu_id, char *buf, int len) { int n = 0; @@ -109,7 +100,7 @@ char *arc_cache_mumbojumbo(int cpu_id, char *buf, int len) * the cpuinfo structure for later use. * No Validation done here, simply read/convert the BCRs */ -void __init read_decode_cache_bcr(void) +void __cpuinit read_decode_cache_bcr(void) { struct bcr_cache ibcr, dbcr; struct cpuinfo_arc_cache *p_ic, *p_dc; @@ -141,13 +132,14 @@ void __init read_decode_cache_bcr(void) * 3. Enable the Caches, setup default flush mode for D-Cache * 3. Calculate the SHMLBA used by user space */ -void __init arc_cache_init(void) +void __cpuinit arc_cache_init(void) { unsigned int temp; unsigned int cpu = smp_processor_id(); struct cpuinfo_arc_cache *ic = &cpuinfo_arc700[cpu].icache; struct cpuinfo_arc_cache *dc = &cpuinfo_arc700[cpu].dcache; int way_pg_ratio = way_pg_ratio; + int dcache_does_alias; char str[256]; printk(arc_cache_mumbojumbo(0, str, sizeof(str))); @@ -171,30 +163,6 @@ void __init arc_cache_init(void) } #endif - - /* - * if Cache way size is <= page size then no aliasing exhibited - * otherwise ratio determines num of aliases. - * e.g. 32K I$, 2 way set assoc, 8k pg size - * way-sz = 32k/2 = 16k - * way-pg-ratio = 16k/8k = 2, so 2 aliases possible - * (meaning 1 line could be in 2 possible locations). - */ - way_pg_ratio = ic->sz / ARC_ICACHE_WAYS / PAGE_SIZE; - switch (way_pg_ratio) { - case 0: - case 1: - ___flush_icache_rtn = __ic_line_inv_no_alias; - break; - case 2: - ___flush_icache_rtn = __ic_line_inv_2_alias; - break; - case 4: - ___flush_icache_rtn = __ic_line_inv_4_alias; - break; - default: - panic("Unsupported I-Cache Sz\n"); - } #endif /* Enable/disable I-Cache */ @@ -218,9 +186,13 @@ chk_dc: panic("Cache H/W doesn't match kernel Config"); } + dcache_does_alias = (dc->sz / ARC_DCACHE_WAYS) > PAGE_SIZE; + /* check for D-Cache aliasing */ - if ((dc->sz / ARC_DCACHE_WAYS) > PAGE_SIZE) - panic("D$ aliasing not handled right now\n"); + if (dcache_does_alias && !cache_is_vipt_aliasing()) + panic("Enable CONFIG_ARC_CACHE_VIPT_ALIASING\n"); + else if (!dcache_does_alias && cache_is_vipt_aliasing()) + panic("Don't need CONFIG_ARC_CACHE_VIPT_ALIASING\n"); #endif /* Set the default Invalidate Mode to "simpy discard dirty lines" @@ -303,47 +275,57 @@ static inline void __dc_entire_op(const int cacheop) * Per Line Operation on D-Cache * Doesn't deal with type-of-op/IRQ-disabling/waiting-for-flush-to-complete * It's sole purpose is to help gcc generate ZOL + * (aliasing VIPT dcache flushing needs both vaddr and paddr) */ -static inline void __dc_line_loop(unsigned long start, unsigned long sz, - int aux_reg) +static inline void __dc_line_loop(unsigned long paddr, unsigned long vaddr, + unsigned long sz, const int aux_reg) { - int num_lines, slack; + int num_lines; /* Ensure we properly floor/ceil the non-line aligned/sized requests - * and have @start - aligned to cache line and integral @num_lines. + * and have @paddr - aligned to cache line and integral @num_lines. * This however can be avoided for page sized since: - * -@start will be cache-line aligned already (being page aligned) + * -@paddr will be cache-line aligned already (being page aligned) * -@sz will be integral multiple of line size (being page sized). */ if (!(__builtin_constant_p(sz) && sz == PAGE_SIZE)) { - slack = start & ~DCACHE_LINE_MASK; - sz += slack; - start -= slack; + sz += paddr & ~DCACHE_LINE_MASK; + paddr &= DCACHE_LINE_MASK; + vaddr &= DCACHE_LINE_MASK; } num_lines = DIV_ROUND_UP(sz, ARC_DCACHE_LINE_LEN); +#if (CONFIG_ARC_MMU_VER <= 2) + paddr |= (vaddr >> PAGE_SHIFT) & 0x1F; +#endif + while (num_lines-- > 0) { #if (CONFIG_ARC_MMU_VER > 2) /* * Just as for I$, in MMU v3, D$ ops also require * "tag" bits in DC_PTAG, "index" bits in FLDL,IVDL ops - * But we pass phy addr for both. This works since Linux - * doesn't support aliasing configs for D$, yet. - * Thus paddr is enough to provide both tag and index. */ - write_aux_reg(ARC_REG_DC_PTAG, start); + write_aux_reg(ARC_REG_DC_PTAG, paddr); + + write_aux_reg(aux_reg, vaddr); + vaddr += ARC_DCACHE_LINE_LEN; +#else + /* paddr contains stuffed vaddrs bits */ + write_aux_reg(aux_reg, paddr); #endif - write_aux_reg(aux_reg, start); - start += ARC_DCACHE_LINE_LEN; + paddr += ARC_DCACHE_LINE_LEN; } } +/* For kernel mappings cache operation: index is same as paddr */ +#define __dc_line_op_k(p, sz, op) __dc_line_op(p, p, sz, op) + /* * D-Cache : Per Line INV (discard or wback+discard) or FLUSH (wback) */ -static inline void __dc_line_op(unsigned long start, unsigned long sz, - const int cacheop) +static inline void __dc_line_op(unsigned long paddr, unsigned long vaddr, + unsigned long sz, const int cacheop) { unsigned long flags, tmp = tmp; int aux; @@ -366,7 +348,7 @@ static inline void __dc_line_op(unsigned long start, unsigned long sz, else aux = ARC_REG_DC_FLDL; - __dc_line_loop(start, sz, aux); + __dc_line_loop(paddr, vaddr, sz, aux); if (cacheop & OP_FLUSH) /* flush / flush-n-inv both wait */ wait_for_flush(); @@ -381,7 +363,8 @@ static inline void __dc_line_op(unsigned long start, unsigned long sz, #else #define __dc_entire_op(cacheop) -#define __dc_line_op(start, sz, cacheop) +#define __dc_line_op(paddr, vaddr, sz, cacheop) +#define __dc_line_op_k(paddr, sz, cacheop) #endif /* CONFIG_ARC_HAS_DCACHE */ @@ -391,75 +374,38 @@ static inline void __dc_line_op(unsigned long start, unsigned long sz, /* * I-Cache Aliasing in ARC700 VIPT caches * - * For fetching code from I$, ARC700 uses vaddr (embedded in program code) - * to "index" into SET of cache-line and paddr from MMU to match the TAG - * in the WAYS of SET. - * - * However the CDU iterface (to flush/inv) lines from software, only takes - * paddr (to have simpler hardware interface). For simpler cases, using paddr - * alone suffices. - * e.g. 2-way-set-assoc, 16K I$ (8k MMU pg sz, 32b cache line size): - * way_sz = cache_sz / num_ways = 16k/2 = 8k - * num_sets = way_sz / line_sz = 8k/32 = 256 => 8 bits - * Ignoring the bottom 5 bits corresp to the off within a 32b cacheline, - * bits req for calc set-index = bits 12:5 (0 based). Since this range fits - * inside the bottom 13 bits of paddr, which are same for vaddr and paddr - * (with 8k pg sz), paddr alone can be safely used by CDU to unambigously - * locate a cache-line. + * ARC VIPT I-cache uses vaddr to index into cache and paddr to match the tag. + * The orig Cache Management Module "CDU" only required paddr to invalidate a + * certain line since it sufficed as index in Non-Aliasing VIPT cache-geometry. + * Infact for distinct V1,V2,P: all of {V1-P},{V2-P},{P-P} would end up fetching + * the exact same line. * - * However for a difft sized cache, say 32k I$, above math yields need - * for 14 bits of vaddr to locate a cache line, which can't be provided by - * paddr, since the bit 13 (0 based) might differ between the two. - * - * This lack of extra bits needed for correct line addressing, defines the - * classical problem of Cache aliasing with VIPT architectures - * num_aliases = 1 << extra_bits - * e.g. 2-way-set-assoc, 32K I$ with 8k MMU pg sz => 2 aliases - * 2-way-set-assoc, 64K I$ with 8k MMU pg sz => 4 aliases - * 2-way-set-assoc, 16K I$ with 8k MMU pg sz => NO aliases + * However for larger Caches (way-size > page-size) - i.e. in Aliasing config, + * paddr alone could not be used to correctly index the cache. * * ------------------ * MMU v1/v2 (Fixed Page Size 8k) * ------------------ * The solution was to provide CDU with these additonal vaddr bits. These - * would be bits [x:13], x would depend on cache-geom. + * would be bits [x:13], x would depend on cache-geometry, 13 comes from + * standard page size of 8k. * H/w folks chose [17:13] to be a future safe range, and moreso these 5 bits * of vaddr could easily be "stuffed" in the paddr as bits [4:0] since the * orig 5 bits of paddr were anyways ignored by CDU line ops, as they * represent the offset within cache-line. The adv of using this "clumsy" - * interface for additional info was no new reg was needed in CDU. + * interface for additional info was no new reg was needed in CDU programming + * model. * * 17:13 represented the max num of bits passable, actual bits needed were * fewer, based on the num-of-aliases possible. * -for 2 alias possibility, only bit 13 needed (32K cache) * -for 4 alias possibility, bits 14:13 needed (64K cache) * - * Since vaddr was not available for all instances of I$ flush req by core - * kernel, the only safe way (non-optimal though) was to kill all possible - * lines which could represent an alias (even if they didnt represent one - * in execution). - * e.g. for 64K I$, 4 aliases possible, so we did - * flush start - * flush start | 0x01 - * flush start | 0x2 - * flush start | 0x3 - * - * The penalty was invoking the operation itself, since tag match is anyways - * paddr based, a line which didn't represent an alias would not match the - * paddr, hence wont be killed - * - * Note that aliasing concerns are independent of line-sz for a given cache - * geometry (size + set_assoc) because the extra bits required by line-sz are - * reduced from the set calc. - * e.g. 2-way-set-assoc, 32K I$ with 8k MMU pg sz and using math above - * 32b line-sz: 9 bits set-index-calc, 5 bits offset-in-line => 1 extra bit - * 64b line-sz: 8 bits set-index-calc, 6 bits offset-in-line => 1 extra bit - * * ------------------ * MMU v3 * ------------------ - * This ver of MMU supports var page sizes (1k-16k) - Linux will support - * 8k (default), 16k and 4k. + * This ver of MMU supports variable page sizes (1k-16k): although Linux will + * only support 8k (default), 16k and 4k. * However from hardware perspective, smaller page sizes aggrevate aliasing * meaning more vaddr bits needed to disambiguate the cache-line-op ; * the existing scheme of piggybacking won't work for certain configurations. @@ -468,144 +414,53 @@ static inline void __dc_line_op(unsigned long start, unsigned long sz, */ /*********************************************************** - * Machine specific helpers for per line I-Cache invalidate. - * 3 routines to accpunt for 1, 2, 4 aliases possible + * Machine specific helper for per line I-Cache invalidate. */ - -static void __ic_line_inv_no_alias(unsigned long start, int num_lines) -{ - while (num_lines-- > 0) { -#if (CONFIG_ARC_MMU_VER > 2) - write_aux_reg(ARC_REG_IC_PTAG, start); -#endif - write_aux_reg(ARC_REG_IC_IVIL, start); - start += ARC_ICACHE_LINE_LEN; - } -} - -static void __ic_line_inv_2_alias(unsigned long start, int num_lines) -{ - while (num_lines-- > 0) { - -#if (CONFIG_ARC_MMU_VER > 2) - /* - * MMU v3, CDU prog model (for line ops) now uses a new IC_PTAG - * reg to pass the "tag" bits and existing IVIL reg only looks - * at bits relevant for "index" (details above) - * Programming Notes: - * -when writing tag to PTAG reg, bit chopping can be avoided, - * CDU ignores non-tag bits. - * -Ideally "index" must be computed from vaddr, but it is not - * avail in these rtns. So to be safe, we kill the lines in all - * possible indexes corresp to num of aliases possible for - * given cache config. - */ - write_aux_reg(ARC_REG_IC_PTAG, start); - write_aux_reg(ARC_REG_IC_IVIL, - start & ~(0x1 << PAGE_SHIFT)); - write_aux_reg(ARC_REG_IC_IVIL, start | (0x1 << PAGE_SHIFT)); -#else - write_aux_reg(ARC_REG_IC_IVIL, start); - write_aux_reg(ARC_REG_IC_IVIL, start | 0x01); -#endif - start += ARC_ICACHE_LINE_LEN; - } -} - -static void __ic_line_inv_4_alias(unsigned long start, int num_lines) -{ - while (num_lines-- > 0) { - -#if (CONFIG_ARC_MMU_VER > 2) - write_aux_reg(ARC_REG_IC_PTAG, start); - - write_aux_reg(ARC_REG_IC_IVIL, - start & ~(0x3 << PAGE_SHIFT)); - write_aux_reg(ARC_REG_IC_IVIL, - start & ~(0x2 << PAGE_SHIFT)); - write_aux_reg(ARC_REG_IC_IVIL, - start & ~(0x1 << PAGE_SHIFT)); - write_aux_reg(ARC_REG_IC_IVIL, start | (0x3 << PAGE_SHIFT)); -#else - write_aux_reg(ARC_REG_IC_IVIL, start); - write_aux_reg(ARC_REG_IC_IVIL, start | 0x01); - write_aux_reg(ARC_REG_IC_IVIL, start | 0x02); - write_aux_reg(ARC_REG_IC_IVIL, start | 0x03); -#endif - start += ARC_ICACHE_LINE_LEN; - } -} - -static void __ic_line_inv(unsigned long start, unsigned long sz) +static void __ic_line_inv_vaddr(unsigned long paddr, unsigned long vaddr, + unsigned long sz) { unsigned long flags; - int num_lines, slack; + int num_lines; /* - * Ensure we properly floor/ceil the non-line aligned/sized requests - * and have @start - aligned to cache line, and integral @num_lines + * Ensure we properly floor/ceil the non-line aligned/sized requests: * However page sized flushes can be compile time optimised. - * -@start will be cache-line aligned already (being page aligned) + * -@paddr will be cache-line aligned already (being page aligned) * -@sz will be integral multiple of line size (being page sized). */ if (!(__builtin_constant_p(sz) && sz == PAGE_SIZE)) { - slack = start & ~ICACHE_LINE_MASK; - sz += slack; - start -= slack; + sz += paddr & ~ICACHE_LINE_MASK; + paddr &= ICACHE_LINE_MASK; + vaddr &= ICACHE_LINE_MASK; } num_lines = DIV_ROUND_UP(sz, ARC_ICACHE_LINE_LEN); - local_irq_save(flags); - (*___flush_icache_rtn) (start, num_lines); - local_irq_restore(flags); -} - -/* Unlike routines above, having vaddr for flush op (along with paddr), - * prevents the need to speculatively kill the lines in multiple sets - * based on ratio of way_sz : pg_sz - */ -static void __ic_line_inv_vaddr(unsigned long phy_start, - unsigned long vaddr, unsigned long sz) -{ - unsigned long flags; - int num_lines, slack; - unsigned int addr; - - slack = phy_start & ~ICACHE_LINE_MASK; - sz += slack; - phy_start -= slack; - num_lines = DIV_ROUND_UP(sz, ARC_ICACHE_LINE_LEN); - -#if (CONFIG_ARC_MMU_VER > 2) - vaddr &= ~ICACHE_LINE_MASK; - addr = phy_start; -#else +#if (CONFIG_ARC_MMU_VER <= 2) /* bits 17:13 of vaddr go as bits 4:0 of paddr */ - addr = phy_start | ((vaddr >> 13) & 0x1F); + paddr |= (vaddr >> PAGE_SHIFT) & 0x1F; #endif local_irq_save(flags); while (num_lines-- > 0) { #if (CONFIG_ARC_MMU_VER > 2) /* tag comes from phy addr */ - write_aux_reg(ARC_REG_IC_PTAG, addr); + write_aux_reg(ARC_REG_IC_PTAG, paddr); /* index bits come from vaddr */ write_aux_reg(ARC_REG_IC_IVIL, vaddr); vaddr += ARC_ICACHE_LINE_LEN; #else - /* this paddr contains vaddrs bits as needed */ - write_aux_reg(ARC_REG_IC_IVIL, addr); + /* paddr contains stuffed vaddrs bits */ + write_aux_reg(ARC_REG_IC_IVIL, paddr); #endif - addr += ARC_ICACHE_LINE_LEN; + paddr += ARC_ICACHE_LINE_LEN; } local_irq_restore(flags); } #else -#define __ic_line_inv(start, sz) #define __ic_line_inv_vaddr(pstart, vstart, sz) #endif /* CONFIG_ARC_HAS_ICACHE */ @@ -615,35 +470,72 @@ static void __ic_line_inv_vaddr(unsigned long phy_start, * Exported APIs */ -/* TBD: use pg_arch_1 to optimize this */ +/* + * Handle cache congruency of kernel and userspace mappings of page when kernel + * writes-to/reads-from + * + * The idea is to defer flushing of kernel mapping after a WRITE, possible if: + * -dcache is NOT aliasing, hence any U/K-mappings of page are congruent + * -U-mapping doesn't exist yet for page (finalised in update_mmu_cache) + * -In SMP, if hardware caches are coherent + * + * There's a corollary case, where kernel READs from a userspace mapped page. + * If the U-mapping is not congruent to to K-mapping, former needs flushing. + */ void flush_dcache_page(struct page *page) { - __dc_line_op((unsigned long)page_address(page), PAGE_SIZE, OP_FLUSH); + struct address_space *mapping; + + if (!cache_is_vipt_aliasing()) { + set_bit(PG_arch_1, &page->flags); + return; + } + + /* don't handle anon pages here */ + mapping = page_mapping(page); + if (!mapping) + return; + + /* + * pagecache page, file not yet mapped to userspace + * Make a note that K-mapping is dirty + */ + if (!mapping_mapped(mapping)) { + set_bit(PG_arch_1, &page->flags); + } else if (page_mapped(page)) { + + /* kernel reading from page with U-mapping */ + void *paddr = page_address(page); + unsigned long vaddr = page->index << PAGE_CACHE_SHIFT; + + if (addr_not_cache_congruent(paddr, vaddr)) + __flush_dcache_page(paddr, vaddr); + } } EXPORT_SYMBOL(flush_dcache_page); void dma_cache_wback_inv(unsigned long start, unsigned long sz) { - __dc_line_op(start, sz, OP_FLUSH_N_INV); + __dc_line_op_k(start, sz, OP_FLUSH_N_INV); } EXPORT_SYMBOL(dma_cache_wback_inv); void dma_cache_inv(unsigned long start, unsigned long sz) { - __dc_line_op(start, sz, OP_INV); + __dc_line_op_k(start, sz, OP_INV); } EXPORT_SYMBOL(dma_cache_inv); void dma_cache_wback(unsigned long start, unsigned long sz) { - __dc_line_op(start, sz, OP_FLUSH); + __dc_line_op_k(start, sz, OP_FLUSH); } EXPORT_SYMBOL(dma_cache_wback); /* - * This is API for making I/D Caches consistent when modifying code - * (loadable modules, kprobes, etc) + * This is API for making I/D Caches consistent when modifying + * kernel code (loadable modules, kprobes, kgdb...) * This is called on insmod, with kernel virtual address for CODE of * the module. ARC cache maintenance ops require PHY address thus we * need to convert vmalloc addr to PHY addr @@ -652,7 +544,6 @@ void flush_icache_range(unsigned long kstart, unsigned long kend) { unsigned int tot_sz, off, sz; unsigned long phy, pfn; - unsigned long flags; /* printk("Kernel Cache Cohenercy: %lx to %lx\n",kstart, kend); */ @@ -673,8 +564,13 @@ void flush_icache_range(unsigned long kstart, unsigned long kend) /* Case: Kernel Phy addr (0x8000_0000 onwards) */ if (likely(kstart > PAGE_OFFSET)) { - __ic_line_inv(kstart, kend - kstart); - __dc_line_op(kstart, kend - kstart, OP_FLUSH); + /* + * The 2nd arg despite being paddr will be used to index icache + * This is OK since no alternate virtual mappings will exist + * given the callers for this case: kprobe/kgdb in built-in + * kernel code only. + */ + __sync_icache_dcache(kstart, kstart, kend - kstart); return; } @@ -692,42 +588,45 @@ void flush_icache_range(unsigned long kstart, unsigned long kend) pfn = vmalloc_to_pfn((void *)kstart); phy = (pfn << PAGE_SHIFT) + off; sz = min_t(unsigned int, tot_sz, PAGE_SIZE - off); - local_irq_save(flags); - __dc_line_op(phy, sz, OP_FLUSH); - __ic_line_inv(phy, sz); - local_irq_restore(flags); + __sync_icache_dcache(phy, kstart, sz); kstart += sz; tot_sz -= sz; } } /* - * Optimised ver of flush_icache_range() with spec callers: ptrace/signals - * where vaddr is also available. This allows passing both vaddr and paddr - * bits to CDU for cache flush, short-circuting the current pessimistic algo - * which kills all possible aliases. - * An added adv of knowing that vaddr is user-vaddr avoids various checks - * and handling for k-vaddr, k-paddr as done in orig ver above + * General purpose helper to make I and D cache lines consistent. + * @paddr is phy addr of region + * @vaddr is typically user or kernel vaddr (vmalloc) + * Howver in one instance, flush_icache_range() by kprobe (for a breakpt in + * builtin kernel code) @vaddr will be paddr only, meaning CDU operation will + * use a paddr to index the cache (despite VIPT). This is fine since since a + * built-in kernel page will not have any virtual mappings (not even kernel) + * kprobe on loadable module is different as it will have kvaddr. */ -void flush_icache_range_vaddr(unsigned long paddr, unsigned long u_vaddr, - int len) +void __sync_icache_dcache(unsigned long paddr, unsigned long vaddr, int len) { - __ic_line_inv_vaddr(paddr, u_vaddr, len); - __dc_line_op(paddr, len, OP_FLUSH); + unsigned long flags; + + local_irq_save(flags); + __ic_line_inv_vaddr(paddr, vaddr, len); + __dc_line_op(paddr, vaddr, len, OP_FLUSH); + local_irq_restore(flags); +} + +/* wrapper to compile time eliminate alignment checks in flush loop */ +void __inv_icache_page(unsigned long paddr, unsigned long vaddr) +{ + __ic_line_inv_vaddr(paddr, vaddr, PAGE_SIZE); } /* - * XXX: This also needs to be optim using pg_arch_1 - * This is called when a page-cache page is about to be mapped into a - * user process' address space. It offers an opportunity for a - * port to ensure d-cache/i-cache coherency if necessary. + * wrapper to clearout kernel or userspace mappings of a page + * For kernel mappings @vaddr == @paddr */ -void flush_icache_page(struct vm_area_struct *vma, struct page *page) +void ___flush_dcache_page(unsigned long paddr, unsigned long vaddr) { - if (!(vma->vm_flags & VM_EXEC)) - return; - - __ic_line_inv((unsigned long)page_address(page), PAGE_SIZE); + __dc_line_op(paddr, vaddr & PAGE_MASK, PAGE_SIZE, OP_FLUSH_N_INV); } void flush_icache_all(void) @@ -756,6 +655,87 @@ noinline void flush_cache_all(void) } +#ifdef CONFIG_ARC_CACHE_VIPT_ALIASING + +void flush_cache_mm(struct mm_struct *mm) +{ + flush_cache_all(); +} + +void flush_cache_page(struct vm_area_struct *vma, unsigned long u_vaddr, + unsigned long pfn) +{ + unsigned int paddr = pfn << PAGE_SHIFT; + + __sync_icache_dcache(paddr, u_vaddr, PAGE_SIZE); +} + +void flush_cache_range(struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + flush_cache_all(); +} + +void copy_user_highpage(struct page *to, struct page *from, + unsigned long u_vaddr, struct vm_area_struct *vma) +{ + void *kfrom = page_address(from); + void *kto = page_address(to); + int clean_src_k_mappings = 0; + + /* + * If SRC page was already mapped in userspace AND it's U-mapping is + * not congruent with K-mapping, sync former to physical page so that + * K-mapping in memcpy below, sees the right data + * + * Note that while @u_vaddr refers to DST page's userspace vaddr, it is + * equally valid for SRC page as well + */ + if (page_mapped(from) && addr_not_cache_congruent(kfrom, u_vaddr)) { + __flush_dcache_page(kfrom, u_vaddr); + clean_src_k_mappings = 1; + } + + copy_page(kto, kfrom); + + /* + * Mark DST page K-mapping as dirty for a later finalization by + * update_mmu_cache(). Although the finalization could have been done + * here as well (given that both vaddr/paddr are available). + * But update_mmu_cache() already has code to do that for other + * non copied user pages (e.g. read faults which wire in pagecache page + * directly). + */ + set_bit(PG_arch_1, &to->flags); + + /* + * if SRC was already usermapped and non-congruent to kernel mapping + * sync the kernel mapping back to physical page + */ + if (clean_src_k_mappings) { + __flush_dcache_page(kfrom, kfrom); + } else { + set_bit(PG_arch_1, &from->flags); + } +} + +void clear_user_page(void *to, unsigned long u_vaddr, struct page *page) +{ + clear_page(to); + set_bit(PG_arch_1, &page->flags); +} + +void flush_anon_page(struct vm_area_struct *vma, struct page *page, + unsigned long u_vaddr) +{ + /* TBD: do we really need to clear the kernel mapping */ + __flush_dcache_page(page_address(page), u_vaddr); + __flush_dcache_page(page_address(page), page_address(page)); + +} + +#endif + /********************************************************************** * Explicit Cache flush request from user space via syscall * Needed for JITs which generate code on the fly diff --git a/arch/arc/mm/extable.c b/arch/arc/mm/extable.c index 014172ba8432..aa652e281324 100644 --- a/arch/arc/mm/extable.c +++ b/arch/arc/mm/extable.c @@ -27,7 +27,7 @@ int fixup_exception(struct pt_regs *regs) #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE -long arc_copy_from_user_noinline(void *to, const void __user * from, +long arc_copy_from_user_noinline(void *to, const void __user *from, unsigned long n) { return __arc_copy_from_user(to, from, n); @@ -48,7 +48,7 @@ unsigned long arc_clear_user_noinline(void __user *to, } EXPORT_SYMBOL(arc_clear_user_noinline); -long arc_strncpy_from_user_noinline (char *dst, const char __user *src, +long arc_strncpy_from_user_noinline(char *dst, const char __user *src, long count) { return __arc_strncpy_from_user(dst, src, count); diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index af55aab803d2..689ffd86d5e9 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -12,7 +12,6 @@ #include #include #include -#include #include #include #include diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c index 727d4794ea0f..4a177365b2c4 100644 --- a/arch/arc/mm/init.c +++ b/arch/arc/mm/init.c @@ -10,9 +10,6 @@ #include #include #include -#ifdef CONFIG_BLOCK_DEV_RAM -#include -#endif #include #include #include diff --git a/arch/arc/mm/ioremap.c b/arch/arc/mm/ioremap.c index 3e5c92c79936..739e65f355de 100644 --- a/arch/arc/mm/ioremap.c +++ b/arch/arc/mm/ioremap.c @@ -12,7 +12,7 @@ #include #include #include -#include +#include void __iomem *ioremap(unsigned long paddr, unsigned long size) { diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c new file mode 100644 index 000000000000..2e06d56e987b --- /dev/null +++ b/arch/arc/mm/mmap.c @@ -0,0 +1,78 @@ +/* + * ARC700 mmap + * + * (started from arm version - for VIPT alias handling) + * + * Copyright (C) 2013 Synopsys, Inc. (www.synopsys.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include + +#define COLOUR_ALIGN(addr, pgoff) \ + ((((addr) + SHMLBA - 1) & ~(SHMLBA - 1)) + \ + (((pgoff) << PAGE_SHIFT) & (SHMLBA - 1))) + +/* + * Ensure that shared mappings are correctly aligned to + * avoid aliasing issues with VIPT caches. + * We need to ensure that + * a specific page of an object is always mapped at a multiple of + * SHMLBA bytes. + */ +unsigned long +arch_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + int do_align = 0; + int aliasing = cache_is_vipt_aliasing(); + struct vm_unmapped_area_info info; + + /* + * We only need to do colour alignment if D cache aliases. + */ + if (aliasing) + do_align = filp || (flags & MAP_SHARED); + + /* + * We enforce the MAP_FIXED case. + */ + if (flags & MAP_FIXED) { + if (aliasing && flags & MAP_SHARED && + (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)) + return -EINVAL; + return addr; + } + + if (len > TASK_SIZE) + return -ENOMEM; + + if (addr) { + if (do_align) + addr = COLOUR_ALIGN(addr, pgoff); + else + addr = PAGE_ALIGN(addr); + + vma = find_vma(mm, addr); + if (TASK_SIZE - len >= addr && + (!vma || addr + len <= vma->vm_start)) + return addr; + } + + info.flags = 0; + info.length = len; + info.low_limit = mm->mmap_base; + info.high_limit = TASK_SIZE; + info.align_mask = do_align ? (PAGE_MASK & (SHMLBA - 1)) : 0; + info.align_offset = pgoff << PAGE_SHIFT; + return vm_unmapped_area(&info); +} diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c index 9b9ce23f4ec3..066145b5f348 100644 --- a/arch/arc/mm/tlb.c +++ b/arch/arc/mm/tlb.c @@ -418,23 +418,52 @@ void create_tlb(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) local_irq_restore(flags); } -/* arch hook called by core VM at the end of handle_mm_fault( ), - * when a new PTE is entered in Page Tables or an existing one - * is modified. We aggresively pre-install a TLB entry +/* + * Called at the end of pagefault, for a userspace mapped page + * -pre-install the corresponding TLB entry into MMU + * -Finalize the delayed D-cache flush of kernel mapping of page due to + * flush_dcache_page(), copy_user_page() + * + * Note that flush (when done) involves both WBACK - so physical page is + * in sync as well as INV - so any non-congruent aliases don't remain */ - -void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddress, +void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr_unaligned, pte_t *ptep) { + unsigned long vaddr = vaddr_unaligned & PAGE_MASK; + unsigned long paddr = pte_val(*ptep) & PAGE_MASK; + + create_tlb(vma, vaddr, ptep); + + /* + * Exec page : Independent of aliasing/page-color considerations, + * since icache doesn't snoop dcache on ARC, any dirty + * K-mapping of a code page needs to be wback+inv so that + * icache fetch by userspace sees code correctly. + * !EXEC page: If K-mapping is NOT congruent to U-mapping, flush it + * so userspace sees the right data. + * (Avoids the flush for Non-exec + congruent mapping case) + */ + if (vma->vm_flags & VM_EXEC || addr_not_cache_congruent(paddr, vaddr)) { + struct page *page = pfn_to_page(pte_pfn(*ptep)); + + int dirty = test_and_clear_bit(PG_arch_1, &page->flags); + if (dirty) { + /* wback + inv dcache lines */ + __flush_dcache_page(paddr, paddr); - create_tlb(vma, vaddress, ptep); + /* invalidate any existing icache lines */ + if (vma->vm_flags & VM_EXEC) + __inv_icache_page(paddr, vaddr); + } + } } /* Read the Cache Build Confuration Registers, Decode them and save into * the cpuinfo structure for later use. * No Validation is done here, simply read/convert the BCRs */ -void __init read_decode_mmu_bcr(void) +void __cpuinit read_decode_mmu_bcr(void) { unsigned int tmp; struct bcr_mmu_1_2 *mmu2; /* encoded MMU2 attr */ @@ -466,7 +495,7 @@ void __init read_decode_mmu_bcr(void) char *arc_mmu_mumbojumbo(int cpu_id, char *buf, int len) { int n = 0; - struct cpuinfo_arc_mmu *p_mmu = &cpuinfo_arc700[smp_processor_id()].mmu; + struct cpuinfo_arc_mmu *p_mmu = &cpuinfo_arc700[cpu_id].mmu; n += scnprintf(buf + n, len - n, "ARC700 MMU [v%x]\t: %dk PAGE, ", p_mmu->ver, TO_KB(p_mmu->pg_sz)); @@ -480,7 +509,7 @@ char *arc_mmu_mumbojumbo(int cpu_id, char *buf, int len) return buf; } -void __init arc_mmu_init(void) +void __cpuinit arc_mmu_init(void) { char str[256]; struct cpuinfo_arc_mmu *mmu = &cpuinfo_arc700[smp_processor_id()].mmu; diff --git a/arch/arc/plat-arcfpga/platform.c b/arch/arc/plat-arcfpga/platform.c index 4e20a1a5104d..b3700c064c06 100644 --- a/arch/arc/plat-arcfpga/platform.c +++ b/arch/arc/plat-arcfpga/platform.c @@ -224,3 +224,15 @@ MACHINE_START(ML509, "ml509") .init_smp = iss_model_init_smp, #endif MACHINE_END + +static const char *nsimosci_compat[] __initdata = { + "snps,nsimosci", + NULL, +}; + +MACHINE_START(NSIMOSCI, "nsimosci") + .dt_compat = nsimosci_compat, + .init_early = NULL, + .init_machine = plat_fpga_populate_dev, + .init_irq = NULL, +MACHINE_END diff --git a/arch/arc/plat-tb10x/Kconfig b/arch/arc/plat-tb10x/Kconfig new file mode 100644 index 000000000000..1d3452100f1f --- /dev/null +++ b/arch/arc/plat-tb10x/Kconfig @@ -0,0 +1,29 @@ +# Abilis Systems TB10x platform kernel configuration file +# +# Author: Christian Ruppert +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License version 2 as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + +menuconfig ARC_PLAT_TB10X + bool "Abilis TB10x" + select COMMON_CLK + select PINCTRL + select PINMUX + select ARCH_REQUIRE_GPIOLIB + help + Support for platforms based on the TB10x home media gateway SOC by + Abilis Systems. TB10x is based on the ARC700 CPU architecture. + Say Y if you are building a kernel for one of the SOCs in this + series (e.g. TB100 or TB101). If in doubt say N. diff --git a/arch/arc/plat-tb10x/Makefile b/arch/arc/plat-tb10x/Makefile new file mode 100644 index 000000000000..89611d25ef35 --- /dev/null +++ b/arch/arc/plat-tb10x/Makefile @@ -0,0 +1,21 @@ +# Abilis Systems TB10x platform Makefile +# +# Author: Christian Ruppert +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License version 2 as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + +KBUILD_CFLAGS += -Iarch/arc/plat-tb10x/include + +obj-y += tb10x.o diff --git a/arch/arc/plat-tb10x/tb10x.c b/arch/arc/plat-tb10x/tb10x.c new file mode 100644 index 000000000000..d3567691c7e1 --- /dev/null +++ b/arch/arc/plat-tb10x/tb10x.c @@ -0,0 +1,71 @@ +/* + * Abilis Systems TB10x platform initialisation + * + * Copyright (C) Abilis Systems 2012 + * + * Author: Christian Ruppert + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + + +#include +#include +#include +#include + +#include + + +static void __init tb10x_platform_init(void) +{ + of_clk_init(NULL); + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); +} + +static void __init tb10x_platform_late_init(void) +{ + struct device_node *dn; + + /* + * Pinctrl documentation recommends setting up the iomux here for + * all modules which don't require control over the pins themselves. + * Modules which need this kind of assistance are compatible with + * "abilis,simple-pinctrl", i.e. we can easily iterate over them. + * TODO: Does this recommended method work cleanly with pins required + * by modules? + */ + for_each_compatible_node(dn, NULL, "abilis,simple-pinctrl") { + struct platform_device *pd = of_find_device_by_node(dn); + struct pinctrl *pctl; + + pctl = pinctrl_get_select(&pd->dev, "abilis,simple-default"); + if (IS_ERR(pctl)) { + int ret = PTR_ERR(pctl); + dev_err(&pd->dev, "Could not set up pinctrl: %d\n", + ret); + } + } +} + +static const char *tb10x_compat[] __initdata = { + "abilis,arc-tb10x", + NULL, +}; + +MACHINE_START(TB10x, "tb10x") + .dt_compat = tb10x_compat, + .init_machine = tb10x_platform_init, + .init_late = tb10x_platform_late_init, +MACHINE_END diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index aa71a2321040..d423d58f938d 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -109,9 +109,6 @@ config MIGHT_HAVE_PCI config SYS_SUPPORTS_APM_EMULATION bool -config GENERIC_GPIO - bool - config HAVE_TCM bool select GENERIC_ALLOCATOR @@ -900,7 +897,6 @@ config ARCH_MULTI_V7 bool "ARMv7 based platforms (Cortex-A, PJ4, Scorpion, Krait)" default y select ARCH_MULTI_V6_V7 - select ARCH_VEXPRESS select CPU_V7 config ARCH_MULTI_V6_V7 diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi index d1101103aa51..1460d9b88adf 100644 --- a/arch/arm/boot/dts/am33xx.dtsi +++ b/arch/arm/boot/dts/am33xx.dtsi @@ -403,5 +403,17 @@ 0x44d80000 0x2000>; /* M3 DMEM */ ti,hwmods = "wkup_m3"; }; + + gpmc: gpmc@50000000 { + compatible = "ti,am3352-gpmc"; + ti,hwmods = "gpmc"; + reg = <0x50000000 0x2000>; + interrupts = <100>; + num-cs = <7>; + num-waitpins = <2>; + #address-cells = <2>; + #size-cells = <1>; + status = "disabled"; + }; }; }; diff --git a/arch/arm/boot/dts/cros5250-common.dtsi b/arch/arm/boot/dts/cros5250-common.dtsi index 0a61bbb9102f..3f0239ec1bc5 100644 --- a/arch/arm/boot/dts/cros5250-common.dtsi +++ b/arch/arm/boot/dts/cros5250-common.dtsi @@ -175,6 +175,14 @@ i2c@12C70000 { samsung,i2c-sda-delay = <100>; samsung,i2c-max-bus-freq = <378000>; + + trackpad { + reg = <0x67>; + compatible = "cypress,cyapa"; + interrupts = <2 0>; + interrupt-parent = <&gpx1>; + wakeup-source; + }; }; i2c@12C80000 { diff --git a/arch/arm/boot/dts/exynos5250-smdk5250.dts b/arch/arm/boot/dts/exynos5250-smdk5250.dts index 26d856ba50a1..3e0c792e2767 100644 --- a/arch/arm/boot/dts/exynos5250-smdk5250.dts +++ b/arch/arm/boot/dts/exynos5250-smdk5250.dts @@ -214,7 +214,7 @@ }; usb@12110000 { - samsung,vbus-gpio = <&gpx2 6 1 3 3>; + samsung,vbus-gpio = <&gpx2 6 0>; }; dp-controller { diff --git a/arch/arm/boot/dts/exynos5250-snow.dts b/arch/arm/boot/dts/exynos5250-snow.dts index bf4744bab445..d449feb7e143 100644 --- a/arch/arm/boot/dts/exynos5250-snow.dts +++ b/arch/arm/boot/dts/exynos5250-snow.dts @@ -183,7 +183,7 @@ }; usb@12110000 { - samsung,vbus-gpio = <&gpx1 1 1 3 3>; + samsung,vbus-gpio = <&gpx1 1 0>; }; fixed-rate-clocks { diff --git a/arch/arm/boot/dts/omap3-beagle-xm.dts b/arch/arm/boot/dts/omap3-beagle-xm.dts index 5a31964ae339..3046d1f81be0 100644 --- a/arch/arm/boot/dts/omap3-beagle-xm.dts +++ b/arch/arm/boot/dts/omap3-beagle-xm.dts @@ -122,6 +122,7 @@ &usb_otg_hs { interface-type = <0>; + usb-phy = <&usb2_phy>; mode = <3>; power = <50>; }; diff --git a/arch/arm/boot/dts/omap3-evm.dts b/arch/arm/boot/dts/omap3-evm.dts index 05f51e10ddd6..96d1c206a57b 100644 --- a/arch/arm/boot/dts/omap3-evm.dts +++ b/arch/arm/boot/dts/omap3-evm.dts @@ -68,6 +68,7 @@ &usb_otg_hs { interface-type = <0>; + usb-phy = <&usb2_phy>; mode = <3>; power = <50>; }; diff --git a/arch/arm/boot/dts/omap3-overo.dtsi b/arch/arm/boot/dts/omap3-overo.dtsi index d4a7280d18b7..a626c50041f6 100644 --- a/arch/arm/boot/dts/omap3-overo.dtsi +++ b/arch/arm/boot/dts/omap3-overo.dtsi @@ -73,6 +73,7 @@ &usb_otg_hs { interface-type = <0>; + usb-phy = <&usb2_phy>; mode = <3>; power = <50>; }; diff --git a/arch/arm/boot/dts/omap3.dtsi b/arch/arm/boot/dts/omap3.dtsi index 4ad03d9dbf0c..82a404da1c0d 100644 --- a/arch/arm/boot/dts/omap3.dtsi +++ b/arch/arm/boot/dts/omap3.dtsi @@ -519,7 +519,6 @@ interrupts = <0 92 0x4>, <0 93 0x4>; interrupt-names = "mc", "dma"; ti,hwmods = "usb_otg_hs"; - usb-phy = <&usb2_phy>; multipoint = <1>; num-eps = <16>; ram-bits = <12>; diff --git a/arch/arm/boot/dts/omap36xx.dtsi b/arch/arm/boot/dts/omap36xx.dtsi index b89233e43b0f..f3447bc1b032 100644 --- a/arch/arm/boot/dts/omap36xx.dtsi +++ b/arch/arm/boot/dts/omap36xx.dtsi @@ -20,9 +20,9 @@ cpu@0 { operating-points = < /* kHz uV */ - 300000 975000 - 600000 1075000 - 800000 1200000 + 300000 1012500 + 600000 1200000 + 800000 1325000 >; clock-latency = <300000>; /* From legacy driver */ }; diff --git a/arch/arm/boot/dts/omap4-sdp.dts b/arch/arm/boot/dts/omap4-sdp.dts index c387bdc1b1d1..a35d9cd58063 100644 --- a/arch/arm/boot/dts/omap4-sdp.dts +++ b/arch/arm/boot/dts/omap4-sdp.dts @@ -223,6 +223,15 @@ >; }; + mcspi1_pins: pinmux_mcspi1_pins { + pinctrl-single,pins = < + 0xf2 0x100 /* mcspi1_clk.mcspi1_clk INPUT | MODE0 */ + 0xf4 0x100 /* mcspi1_somi.mcspi1_somi INPUT | MODE0 */ + 0xf6 0x100 /* mcspi1_simo.mcspi1_simo INPUT | MODE0 */ + 0xf8 0x100 /* mcspi1_cs0.mcspi1_cs0 INPUT | MODE0*/ + >; + }; + dss_hdmi_pins: pinmux_dss_hdmi_pins { pinctrl-single,pins = < 0x5a 0x118 /* hdmi_cec.hdmi_cec INPUT PULLUP | MODE 0 */ @@ -358,12 +367,15 @@ }; &mcspi1 { + pinctrl-names = "default"; + pinctrl-0 = <&mcspi1_pins>; + eth@0 { compatible = "ks8851"; spi-max-frequency = <24000000>; reg = <0>; interrupt-parent = <&gpio2>; - interrupts = <2>; /* gpio line 34 */ + interrupts = <2 8>; /* gpio line 34, low triggered */ vdd-supply = <&vdd_eth>; }; }; diff --git a/arch/arm/boot/dts/omap4-var-som.dts b/arch/arm/boot/dts/omap4-var-som.dts index 222a413c2c51..7e04103779c4 100644 --- a/arch/arm/boot/dts/omap4-var-som.dts +++ b/arch/arm/boot/dts/omap4-var-som.dts @@ -68,7 +68,7 @@ spi-max-frequency = <24000000>; reg = <0>; interrupt-parent = <&gpio6>; - interrupts = <11>; /* gpio line 171 */ + interrupts = <11 8>; /* gpio line 171, low triggered */ vdd-supply = <&vdd_eth>; }; }; diff --git a/arch/arm/boot/dts/omap4460.dtsi b/arch/arm/boot/dts/omap4460.dtsi index 7c2c23cc17ef..2cf227c86099 100644 --- a/arch/arm/boot/dts/omap4460.dtsi +++ b/arch/arm/boot/dts/omap4460.dtsi @@ -15,9 +15,9 @@ cpu@0 { operating-points = < /* kHz uV */ - 350000 975000 - 700000 1075000 - 920000 1200000 + 350000 1025000 + 700000 1200000 + 920000 1313000 >; clock-latency = <300000>; /* From legacy driver */ }; diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig index 33903ca0d879..c1ef64bc5abd 100644 --- a/arch/arm/configs/omap2plus_defconfig +++ b/arch/arm/configs/omap2plus_defconfig @@ -137,6 +137,8 @@ CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y CONFIG_SERIAL_AMBA_PL011=y CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +CONFIG_SERIAL_OMAP=y +CONFIG_SERIAL_OMAP_CONSOLE=y CONFIG_HW_RANDOM=y CONFIG_I2C_CHARDEV=y CONFIG_SPI=y @@ -153,6 +155,7 @@ CONFIG_OMAP_WATCHDOG=y CONFIG_TWL4030_WATCHDOG=y CONFIG_MFD_TPS65217=y CONFIG_MFD_TPS65910=y +CONFIG_TWL6040_CORE=y CONFIG_REGULATOR_TWL4030=y CONFIG_REGULATOR_TPS65023=y CONFIG_REGULATOR_TPS6507X=y @@ -195,6 +198,7 @@ CONFIG_SND_USB_AUDIO=m CONFIG_SND_SOC=m CONFIG_SND_OMAP_SOC=m CONFIG_SND_OMAP_SOC_OMAP_TWL4030=m +CONFIG_SND_OMAP_SOC_OMAP_ABE_TWL6040=m CONFIG_SND_OMAP_SOC_OMAP3_PANDORA=m CONFIG_USB=y CONFIG_USB_DEBUG=y diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c index 70f1bdeb241b..5af04f6daa33 100644 --- a/arch/arm/kernel/devtree.c +++ b/arch/arm/kernel/devtree.c @@ -180,6 +180,13 @@ struct machine_desc * __init setup_machine_fdt(unsigned int dt_phys) unsigned long dt_root; const char *model; +#ifdef CONFIG_ARCH_MULTIPLATFORM + DT_MACHINE_START(GENERIC_DT, "Generic DT based system") + MACHINE_END + + mdesc_best = (struct machine_desc *)&__mach_desc_GENERIC_DT; +#endif + if (!dt_phys) return NULL; diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 728007c4a2b7..1522c7ae31b0 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -659,9 +660,19 @@ struct screen_info screen_info = { static int __init customize_machine(void) { - /* customizes platform devices, or adds new ones */ + /* + * customizes platform devices, or adds new ones + * On DT based machines, we fall back to populating the + * machine from the device tree, if no callback is provided, + * otherwise we would always need an init_machine callback. + */ if (machine_desc->init_machine) machine_desc->init_machine(); +#ifdef CONFIG_OF + else + of_platform_populate(NULL, of_default_bus_match_table, + NULL, NULL); +#endif return 0; } arch_initcall(customize_machine); diff --git a/arch/arm/mach-exynos/include/mach/regs-pmu.h b/arch/arm/mach-exynos/include/mach/regs-pmu.h index 3f30aa1ae354..57344b7e98ce 100644 --- a/arch/arm/mach-exynos/include/mach/regs-pmu.h +++ b/arch/arm/mach-exynos/include/mach/regs-pmu.h @@ -344,6 +344,7 @@ #define EXYNOS5_FSYS_ARM_OPTION S5P_PMUREG(0x2208) #define EXYNOS5_ISP_ARM_OPTION S5P_PMUREG(0x2288) #define EXYNOS5_ARM_COMMON_OPTION S5P_PMUREG(0x2408) +#define EXYNOS5_ARM_L2_OPTION S5P_PMUREG(0x2608) #define EXYNOS5_TOP_PWR_OPTION S5P_PMUREG(0x2C48) #define EXYNOS5_TOP_PWR_SYSMEM_OPTION S5P_PMUREG(0x2CC8) #define EXYNOS5_JPEG_MEM_OPTION S5P_PMUREG(0x2F48) diff --git a/arch/arm/mach-exynos/pmu.c b/arch/arm/mach-exynos/pmu.c index daebc1abc966..97d688526258 100644 --- a/arch/arm/mach-exynos/pmu.c +++ b/arch/arm/mach-exynos/pmu.c @@ -228,6 +228,7 @@ static struct exynos_pmu_conf exynos5250_pmu_config[] = { { EXYNOS5_DIS_IRQ_ISP_ARM_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} }, { EXYNOS5_ARM_COMMON_SYS_PWR_REG, { 0x0, 0x0, 0x2} }, { EXYNOS5_ARM_L2_SYS_PWR_REG, { 0x3, 0x3, 0x3} }, + { EXYNOS5_ARM_L2_OPTION, { 0x10, 0x10, 0x0 } }, { EXYNOS5_CMU_ACLKSTOP_SYS_PWR_REG, { 0x1, 0x0, 0x1} }, { EXYNOS5_CMU_SCLKSTOP_SYS_PWR_REG, { 0x1, 0x0, 0x1} }, { EXYNOS5_CMU_RESET_SYS_PWR_REG, { 0x1, 0x1, 0x0} }, @@ -353,11 +354,9 @@ static void exynos5_init_pmu(void) /* * SKIP_DEACTIVATE_ACEACP_IN_PWDN_BITFIELD Enable - * MANUAL_L2RSTDISABLE_CONTROL_BITFIELD Enable */ tmp = __raw_readl(EXYNOS5_ARM_COMMON_OPTION); - tmp |= (EXYNOS5_MANUAL_L2RSTDISABLE_CONTROL | - EXYNOS5_SKIP_DEACTIVATE_ACEACP_IN_PWDN); + tmp |= EXYNOS5_SKIP_DEACTIVATE_ACEACP_IN_PWDN; __raw_writel(tmp, EXYNOS5_ARM_COMMON_OPTION); /* diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig index 78f795d73cb6..ba44328464f3 100644 --- a/arch/arm/mach-imx/Kconfig +++ b/arch/arm/mach-imx/Kconfig @@ -5,6 +5,7 @@ config ARCH_MXC select AUTO_ZRELADDR if !ZBOOT_ROM select CLKDEV_LOOKUP select CLKSRC_MMIO + select GENERIC_ALLOCATOR select GENERIC_CLOCKEVENTS select GENERIC_IRQ_CHIP select MULTI_IRQ_HANDLER @@ -61,10 +62,6 @@ config MXC_ULPI config ARCH_HAS_RNGA bool -config IRAM_ALLOC - bool - select GENERIC_ALLOCATOR - config HAVE_IMX_ANATOP bool diff --git a/arch/arm/mach-imx/Makefile b/arch/arm/mach-imx/Makefile index 930958973f81..70ae7c490ac0 100644 --- a/arch/arm/mach-imx/Makefile +++ b/arch/arm/mach-imx/Makefile @@ -23,7 +23,6 @@ obj-$(CONFIG_ARCH_MXC_IOMUX_V3) += iomux-v3.o obj-$(CONFIG_MXC_TZIC) += tzic.o obj-$(CONFIG_MXC_AVIC) += avic.o -obj-$(CONFIG_IRAM_ALLOC) += iram_alloc.o obj-$(CONFIG_MXC_ULPI) += ulpi.o obj-$(CONFIG_MXC_USE_EPIT) += epit.o obj-$(CONFIG_MXC_DEBUG_BOARD) += 3ds_debugboard.o diff --git a/arch/arm/mach-imx/common.h b/arch/arm/mach-imx/common.h index 4cba7dbb079f..c08ae3f99cee 100644 --- a/arch/arm/mach-imx/common.h +++ b/arch/arm/mach-imx/common.h @@ -12,6 +12,7 @@ #define __ASM_ARCH_MXC_COMMON_H__ struct platform_device; +struct pt_regs; struct clk; enum mxc_cpu_pwr_mode; diff --git a/arch/arm/mach-imx/headsmp.S b/arch/arm/mach-imx/headsmp.S index a58c8b0527cc..67b9c48dcafe 100644 --- a/arch/arm/mach-imx/headsmp.S +++ b/arch/arm/mach-imx/headsmp.S @@ -24,7 +24,7 @@ ENTRY(v7_secondary_startup) ENDPROC(v7_secondary_startup) #endif -#ifdef CONFIG_PM +#ifdef CONFIG_ARM_CPU_SUSPEND /* * The following code must assume it is running from physical address * where absolute virtual addresses to the data section have to be diff --git a/arch/arm/mach-imx/hotplug.c b/arch/arm/mach-imx/hotplug.c index 5e91112dcbee..3daf1ed90579 100644 --- a/arch/arm/mach-imx/hotplug.c +++ b/arch/arm/mach-imx/hotplug.c @@ -11,7 +11,9 @@ */ #include +#include #include +#include #include "common.h" diff --git a/arch/arm/mach-imx/iram_alloc.c b/arch/arm/mach-imx/iram_alloc.c deleted file mode 100644 index e05cf407db65..000000000000 --- a/arch/arm/mach-imx/iram_alloc.c +++ /dev/null @@ -1,73 +0,0 @@ -/* - * Copyright (C) 2010 Freescale Semiconductor, Inc. All Rights Reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version 2 - * of the License, or (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, - * MA 02110-1301, USA. - */ - -#include -#include -#include -#include -#include -#include "linux/platform_data/imx-iram.h" - -static unsigned long iram_phys_base; -static void __iomem *iram_virt_base; -static struct gen_pool *iram_pool; - -static inline void __iomem *iram_phys_to_virt(unsigned long p) -{ - return iram_virt_base + (p - iram_phys_base); -} - -void __iomem *iram_alloc(unsigned int size, unsigned long *dma_addr) -{ - if (!iram_pool) - return NULL; - - *dma_addr = gen_pool_alloc(iram_pool, size); - pr_debug("iram alloc - %dB@0x%lX\n", size, *dma_addr); - if (!*dma_addr) - return NULL; - return iram_phys_to_virt(*dma_addr); -} -EXPORT_SYMBOL(iram_alloc); - -void iram_free(unsigned long addr, unsigned int size) -{ - if (!iram_pool) - return; - - gen_pool_free(iram_pool, addr, size); -} -EXPORT_SYMBOL(iram_free); - -int __init iram_init(unsigned long base, unsigned long size) -{ - iram_phys_base = base; - - iram_pool = gen_pool_create(PAGE_SHIFT, -1); - if (!iram_pool) - return -ENOMEM; - - gen_pool_add(iram_pool, base, size, -1); - iram_virt_base = ioremap(iram_phys_base, size); - if (!iram_virt_base) - return -EIO; - - pr_debug("i.MX IRAM pool: %ld KB@0x%p\n", size / 1024, iram_virt_base); - return 0; -} diff --git a/arch/arm/mach-omap1/dma.c b/arch/arm/mach-omap1/dma.c index 1a4e887f028d..68ab858e27b7 100644 --- a/arch/arm/mach-omap1/dma.c +++ b/arch/arm/mach-omap1/dma.c @@ -301,7 +301,7 @@ static int __init omap1_system_dma_init(void) if (ret) { dev_err(&pdev->dev, "%s: Unable to add resources for %s%d\n", __func__, pdev->name, pdev->id); - goto exit_device_put; + goto exit_iounmap; } p = kzalloc(sizeof(struct omap_system_dma_plat_info), GFP_KERNEL); @@ -309,7 +309,7 @@ static int __init omap1_system_dma_init(void) dev_err(&pdev->dev, "%s: Unable to allocate 'p' for %s\n", __func__, pdev->name); ret = -ENOMEM; - goto exit_device_del; + goto exit_iounmap; } d = kzalloc(sizeof(struct omap_dma_dev_attr), GFP_KERNEL); @@ -402,8 +402,8 @@ exit_release_d: kfree(d); exit_release_p: kfree(p); -exit_device_del: - platform_device_del(pdev); +exit_iounmap: + iounmap(dma_base); exit_device_put: platform_device_put(pdev); diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index 857b1f097fd8..f49cd51e162a 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -37,8 +37,6 @@ config ARCH_OMAP2PLUS_TYPICAL select NEON if ARCH_OMAP3 || ARCH_OMAP4 || SOC_OMAP5 select PM_RUNTIME select REGULATOR - select SERIAL_OMAP - select SERIAL_OMAP_CONSOLE select TWL4030_CORE if ARCH_OMAP3 || ARCH_OMAP4 select TWL4030_POWER if ARCH_OMAP3 || ARCH_OMAP4 select VFP diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile index 62bb352c2d37..55a9d6777683 100644 --- a/arch/arm/mach-omap2/Makefile +++ b/arch/arm/mach-omap2/Makefile @@ -32,12 +32,12 @@ obj-$(CONFIG_SOC_HAS_OMAP2_SDRC) += sdrc.o # SMP support ONLY available for OMAP4 -obj-$(CONFIG_SMP) += omap-smp.o omap-headsmp.o -obj-$(CONFIG_HOTPLUG_CPU) += omap-hotplug.o +smp-$(CONFIG_SMP) += omap-smp.o omap-headsmp.o +smp-$(CONFIG_HOTPLUG_CPU) += omap-hotplug.o omap-4-5-common = omap4-common.o omap-wakeupgen.o \ sleep44xx.o -obj-$(CONFIG_ARCH_OMAP4) += $(omap-4-5-common) -obj-$(CONFIG_SOC_OMAP5) += $(omap-4-5-common) +obj-$(CONFIG_ARCH_OMAP4) += $(omap-4-5-common) $(smp-y) +obj-$(CONFIG_SOC_OMAP5) += $(omap-4-5-common) $(smp-y) plus_sec := $(call as-instr,.arch_extension sec,+sec) AFLAGS_omap-headsmp.o :=-Wa,-march=armv7-a$(plus_sec) diff --git a/arch/arm/mach-omap2/board-omap3beagle.c b/arch/arm/mach-omap2/board-omap3beagle.c index 6de78605c0af..04c116555412 100644 --- a/arch/arm/mach-omap2/board-omap3beagle.c +++ b/arch/arm/mach-omap2/board-omap3beagle.c @@ -112,13 +112,13 @@ static u8 omap3_beagle_version; */ static struct { int mmc1_gpio_wp; - int usb_pwr_level; + bool usb_pwr_level; /* 0 - Active Low, 1 - Active High */ int dvi_pd_gpio; int usr_button_gpio; int mmc_caps; } beagle_config = { .mmc1_gpio_wp = -EINVAL, - .usb_pwr_level = GPIOF_OUT_INIT_LOW, + .usb_pwr_level = 0, .dvi_pd_gpio = -EINVAL, .usr_button_gpio = 4, .mmc_caps = MMC_CAP_4_BIT_DATA | MMC_CAP_8_BIT_DATA, @@ -178,7 +178,7 @@ static void __init omap3_beagle_init_rev(void) case 0: printk(KERN_INFO "OMAP3 Beagle Rev: xM Ax/Bx\n"); omap3_beagle_version = OMAP3BEAGLE_BOARD_XM; - beagle_config.usb_pwr_level = GPIOF_OUT_INIT_HIGH; + beagle_config.usb_pwr_level = 1; beagle_config.mmc_caps &= ~MMC_CAP_8_BIT_DATA; break; case 2: diff --git a/arch/arm/mach-omap2/board-rx51-peripherals.c b/arch/arm/mach-omap2/board-rx51-peripherals.c index 1a884670a6c4..18ca61e300b3 100644 --- a/arch/arm/mach-omap2/board-rx51-peripherals.c +++ b/arch/arm/mach-omap2/board-rx51-peripherals.c @@ -73,11 +73,11 @@ #define LIS302_IRQ1_GPIO 181 #define LIS302_IRQ2_GPIO 180 /* Not yet in use */ -/* list all spi devices here */ +/* List all SPI devices here. Note that the list/probe order seems to matter! */ enum { RX51_SPI_WL1251, - RX51_SPI_MIPID, /* LCD panel */ RX51_SPI_TSC2005, /* Touch Controller */ + RX51_SPI_MIPID, /* LCD panel */ }; static struct wl12xx_platform_data wl1251_pdata; diff --git a/arch/arm/mach-omap2/dma.c b/arch/arm/mach-omap2/dma.c index dab9fc014b97..49fd0d501c9b 100644 --- a/arch/arm/mach-omap2/dma.c +++ b/arch/arm/mach-omap2/dma.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include "soc.h" @@ -304,6 +305,9 @@ static int __init omap2_system_dma_init(void) if (res) return res; + if (of_have_populated_dt()) + return res; + pdev = platform_device_register_full(&omap_dma_dev_info); if (IS_ERR(pdev)) return PTR_ERR(pdev); diff --git a/arch/arm/mach-omap2/gpmc.c b/arch/arm/mach-omap2/gpmc.c index ed946df5ad8a..6c4da1254f53 100644 --- a/arch/arm/mach-omap2/gpmc.c +++ b/arch/arm/mach-omap2/gpmc.c @@ -1520,36 +1520,22 @@ static int gpmc_probe_dt(struct platform_device *pdev) return ret; } - for_each_node_by_name(child, "nand") { - ret = gpmc_probe_nand_child(pdev, child); - if (ret < 0) { - of_node_put(child); - return ret; - } - } + for_each_child_of_node(pdev->dev.of_node, child) { - for_each_node_by_name(child, "onenand") { - ret = gpmc_probe_onenand_child(pdev, child); - if (ret < 0) { - of_node_put(child); - return ret; - } - } + if (!child->name) + continue; - for_each_node_by_name(child, "nor") { - ret = gpmc_probe_generic_child(pdev, child); - if (ret < 0) { - of_node_put(child); - return ret; - } - } + if (of_node_cmp(child->name, "nand") == 0) + ret = gpmc_probe_nand_child(pdev, child); + else if (of_node_cmp(child->name, "onenand") == 0) + ret = gpmc_probe_onenand_child(pdev, child); + else if (of_node_cmp(child->name, "ethernet") == 0 || + of_node_cmp(child->name, "nor") == 0) + ret = gpmc_probe_generic_child(pdev, child); - for_each_node_by_name(child, "ethernet") { - ret = gpmc_probe_generic_child(pdev, child); - if (ret < 0) { + if (WARN(ret < 0, "%s: probing gpmc child %s failed\n", + __func__, child->full_name)) of_node_put(child); - return ret; - } } return 0; diff --git a/arch/arm/mach-omap2/id.c b/arch/arm/mach-omap2/id.c index 0f4c18e6e60c..1272c41d4749 100644 --- a/arch/arm/mach-omap2/id.c +++ b/arch/arm/mach-omap2/id.c @@ -419,11 +419,15 @@ void __init omap3xxx_check_revision(void) cpu_rev = "1.0"; break; case 1: - /* FALLTHROUGH */ - default: omap_revision = AM335X_REV_ES2_0; cpu_rev = "2.0"; break; + case 2: + /* FALLTHROUGH */ + default: + omap_revision = AM335X_REV_ES2_1; + cpu_rev = "2.1"; + break; } break; case 0xb8f2: @@ -644,13 +648,12 @@ void __init omap_soc_device_init(void) soc_dev_attr->revision = soc_rev; soc_dev = soc_device_register(soc_dev_attr); - if (IS_ERR_OR_NULL(soc_dev)) { + if (IS_ERR(soc_dev)) { kfree(soc_dev_attr); return; } parent = soc_device_to_device(soc_dev); - if (!IS_ERR_OR_NULL(parent)) - device_create_file(parent, &omap_soc_attr); + device_create_file(parent, &omap_soc_attr); } #endif /* CONFIG_SOC_BUS */ diff --git a/arch/arm/mach-omap2/mux34xx.h b/arch/arm/mach-omap2/mux34xx.h index 6543ebf8ecfc..3f26d297c082 100644 --- a/arch/arm/mach-omap2/mux34xx.h +++ b/arch/arm/mach-omap2/mux34xx.h @@ -393,6 +393,10 @@ #define OMAP3_CONTROL_PADCONF_SAD2D_SWAKEUP_OFFSET 0xa1c #define OMAP3_CONTROL_PADCONF_JTAG_RTCK_OFFSET 0xa1e #define OMAP3_CONTROL_PADCONF_JTAG_TDO_OFFSET 0xa20 +#define OMAP3_CONTROL_PADCONF_GPIO_127 0xa24 +#define OMAP3_CONTROL_PADCONF_GPIO_126 0xa26 +#define OMAP3_CONTROL_PADCONF_GPIO_128 0xa28 +#define OMAP3_CONTROL_PADCONF_GPIO_129 0xa2a #define OMAP3_CONTROL_PADCONF_MUX_SIZE \ - (OMAP3_CONTROL_PADCONF_JTAG_TDO_OFFSET + 0x2) + (OMAP3_CONTROL_PADCONF_GPIO_129 + 0x2) diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c index eeea4fa28fbc..e6d230700b2b 100644 --- a/arch/arm/mach-omap2/omap_device.c +++ b/arch/arm/mach-omap2/omap_device.c @@ -876,4 +876,4 @@ static int __init omap_device_late_init(void) bus_for_each_dev(&platform_bus_type, NULL, NULL, omap_device_late_idle); return 0; } -omap_late_initcall(omap_device_late_init); +omap_late_initcall_sync(omap_device_late_init); diff --git a/arch/arm/mach-omap2/soc.h b/arch/arm/mach-omap2/soc.h index 18fdeeb3a44a..197cc16870d9 100644 --- a/arch/arm/mach-omap2/soc.h +++ b/arch/arm/mach-omap2/soc.h @@ -396,6 +396,7 @@ IS_OMAP_TYPE(3430, 0x3430) #define AM335X_CLASS 0x33500033 #define AM335X_REV_ES1_0 AM335X_CLASS #define AM335X_REV_ES2_0 (AM335X_CLASS | (0x1 << 8)) +#define AM335X_REV_ES2_1 (AM335X_CLASS | (0x2 << 8)) #define OMAP443X_CLASS 0x44300044 #define OMAP4430_REV_ES1_0 (OMAP443X_CLASS | (0x10 << 8)) @@ -496,6 +497,7 @@ level(__##fn); #define omap_subsys_initcall(fn) omap_initcall(subsys_initcall, fn) #define omap_device_initcall(fn) omap_initcall(device_initcall, fn) #define omap_late_initcall(fn) omap_initcall(late_initcall, fn) +#define omap_late_initcall_sync(fn) omap_initcall(late_initcall_sync, fn) #endif /* __ASSEMBLY__ */ diff --git a/arch/arm/mach-prima2/Kconfig b/arch/arm/mach-prima2/Kconfig index 80ca974b2f82..6988b117fc17 100644 --- a/arch/arm/mach-prima2/Kconfig +++ b/arch/arm/mach-prima2/Kconfig @@ -38,7 +38,7 @@ config ARCH_MARCO select CPU_V7 select HAVE_ARM_SCU if SMP select HAVE_SMP - select SMP_ON_UP + select SMP_ON_UP if SMP help Support for CSR SiRFSoC ARM Cortex A9 Platform diff --git a/arch/arm/mach-pxa/Kconfig b/arch/arm/mach-pxa/Kconfig index 9075461999c1..96100dbf5a2e 100644 --- a/arch/arm/mach-pxa/Kconfig +++ b/arch/arm/mach-pxa/Kconfig @@ -162,7 +162,6 @@ config MACH_XCEP select MTD select MTD_CFI select MTD_CFI_INTELEXT - select MTD_CHAR select MTD_PHYSMAP select PXA25x select SMC91X diff --git a/arch/arm/mach-spear/spear13xx.c b/arch/arm/mach-spear/spear13xx.c index 3621599c38ad..7aa6e8cf830f 100644 --- a/arch/arm/mach-spear/spear13xx.c +++ b/arch/arm/mach-spear/spear13xx.c @@ -35,6 +35,8 @@ void __init spear13xx_l2x0_init(void) * write alloc and 'Full line of zero' options * */ + if (!IS_ENABLED(CONFIG_CACHE_L2X0)) + return; writel_relaxed(0x06, VA_L2CC_BASE + L2X0_PREFETCH_CTRL); diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig index 20c3b372cdf5..84d72fc36dfe 100644 --- a/arch/arm/mach-tegra/Kconfig +++ b/arch/arm/mach-tegra/Kconfig @@ -63,6 +63,7 @@ config ARCH_TEGRA_114_SOC select ARM_ARCH_TIMER select ARM_GIC select ARM_L1_CACHE_SHIFT_6 + select CPU_FREQ_TABLE if CPU_FREQ select CPU_V7 select PINCTRL select PINCTRL_TEGRA114 diff --git a/arch/arm/mach-ux500/Kconfig b/arch/arm/mach-ux500/Kconfig index f66d7deae46d..6a4387e39df8 100644 --- a/arch/arm/mach-ux500/Kconfig +++ b/arch/arm/mach-ux500/Kconfig @@ -19,6 +19,8 @@ if ARCH_U8500 config UX500_SOC_COMMON bool default y + select ABX500_CORE + select AB8500_CORE select ARM_ERRATA_754322 select ARM_ERRATA_764369 if SMP select ARM_GIC diff --git a/arch/arm/mach-ux500/board-mop500.c b/arch/arm/mach-ux500/board-mop500.c index a15dd6b63a8f..3cd555ac6d0a 100644 --- a/arch/arm/mach-ux500/board-mop500.c +++ b/arch/arm/mach-ux500/board-mop500.c @@ -403,8 +403,8 @@ static int mop500_prox_activate(struct device *dev) "no regulator\n"); return PTR_ERR(prox_regulator); } - regulator_enable(prox_regulator); - return 0; + + return regulator_enable(prox_regulator); } static void mop500_prox_deactivate(struct device *dev) diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c index 995928ba22fd..e90b5ab23b6d 100644 --- a/arch/arm/mach-ux500/cpu-db8500.c +++ b/arch/arm/mach-ux500/cpu-db8500.c @@ -191,7 +191,7 @@ static const char *db8500_read_soc_id(void) /* Throw these device-specific numbers into the entropy pool */ add_device_randomness(uid, 0x14); return kasprintf(GFP_KERNEL, "%08x%08x%08x%08x%08x", - readl((u32 *)uid+1), + readl((u32 *)uid+0), readl((u32 *)uid+1), readl((u32 *)uid+2), readl((u32 *)uid+3), readl((u32 *)uid+4)); } diff --git a/arch/arm/plat-orion/Makefile b/arch/arm/plat-orion/Makefile index 2eca54b65906..9433605cd290 100644 --- a/arch/arm/plat-orion/Makefile +++ b/arch/arm/plat-orion/Makefile @@ -3,6 +3,6 @@ # ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include -orion-gpio-$(CONFIG_GENERIC_GPIO) += gpio.o +orion-gpio-$(CONFIG_GPIOLIB) += gpio.o obj-$(CONFIG_PLAT_ORION_LEGACY) += irq.o pcie.o time.o common.o mpp.o obj-$(CONFIG_PLAT_ORION_LEGACY) += $(orion-gpio-y) diff --git a/arch/arm/plat-orion/gpio.c b/arch/arm/plat-orion/gpio.c index e39c2ba6e2fb..249fe6333e18 100644 --- a/arch/arm/plat-orion/gpio.c +++ b/arch/arm/plat-orion/gpio.c @@ -150,7 +150,7 @@ err_out: } /* - * GENERIC_GPIO primitives. + * GPIO primitives. */ static int orion_gpio_request(struct gpio_chip *chip, unsigned pin) { diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 73b6e764034c..48347dcf0566 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -6,6 +6,7 @@ config ARM64 select ARCH_WANT_FRAME_POINTERS select ARM_AMBA select ARM_ARCH_TIMER + select ARM_GIC select CLONE_BACKWARDS select COMMON_CLK select GENERIC_CLOCKEVENTS @@ -31,6 +32,8 @@ config ARM64 select OF select OF_EARLY_FLATTREE select PERF_USE_VMALLOC + select POWER_RESET + select POWER_SUPPLY select RTC_LIB select SPARSE_IRQ select SYSCTL_EXCEPTION_TRACE @@ -92,9 +95,6 @@ config SWIOTLB config IOMMU_HELPER def_bool SWIOTLB -config GENERIC_GPIO - bool - source "init/Kconfig" source "kernel/Kconfig.freezer" @@ -105,6 +105,7 @@ config ARCH_VEXPRESS bool "ARMv8 software model (Versatile Express)" select ARCH_REQUIRE_GPIOLIB select COMMON_CLK_VERSATILE + select POWER_RESET_VEXPRESS select VEXPRESS_CONFIG help This enables support for the ARMv8 software model (Versatile diff --git a/arch/arm64/boot/dts/foundation-v8.dts b/arch/arm64/boot/dts/foundation-v8.dts index 198682b6de31..84fcc5018284 100644 --- a/arch/arm64/boot/dts/foundation-v8.dts +++ b/arch/arm64/boot/dts/foundation-v8.dts @@ -23,7 +23,7 @@ }; cpus { - #address-cells = <1>; + #address-cells = <2>; #size-cells = <0>; cpu@0 { diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h index 95e407255347..a6e1750369ef 100644 --- a/arch/arm64/include/asm/system_misc.h +++ b/arch/arm64/include/asm/system_misc.h @@ -41,7 +41,7 @@ extern void show_pte(struct mm_struct *mm, unsigned long addr); extern void __show_regs(struct pt_regs *); void soft_restart(unsigned long); -extern void (*pm_restart)(const char *cmd); +extern void (*arm_pm_restart)(char str, const char *cmd); #define UDBG_UNDEFINED (1 << 0) #define UDBG_SYSCALL (1 << 1) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index f4919721f7dd..46f02c3b5015 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -81,8 +81,8 @@ void soft_restart(unsigned long addr) void (*pm_power_off)(void); EXPORT_SYMBOL_GPL(pm_power_off); -void (*pm_restart)(const char *cmd); -EXPORT_SYMBOL_GPL(pm_restart); +void (*arm_pm_restart)(char str, const char *cmd); +EXPORT_SYMBOL_GPL(arm_pm_restart); void arch_cpu_idle_prepare(void) { @@ -131,8 +131,8 @@ void machine_restart(char *cmd) local_fiq_disable(); /* Now call the architecture specific reboot code. */ - if (pm_restart) - pm_restart(cmd); + if (arm_pm_restart) + arm_pm_restart('h', cmd); /* * Whoops - the architecture was unable to reboot. diff --git a/arch/arm64/lib/bitops.S b/arch/arm64/lib/bitops.S index 36216d30cb9a..e5db797790d3 100644 --- a/arch/arm64/lib/bitops.S +++ b/arch/arm64/lib/bitops.S @@ -21,13 +21,13 @@ /* * x0: bits 5:0 bit offset - * bits 63:6 word offset + * bits 31:6 word offset * x1: address */ .macro bitop, name, instr ENTRY( \name ) - and x3, x0, #63 // Get bit offset - eor x0, x0, x3 // Clear low bits + and w3, w0, #63 // Get bit offset + eor w0, w0, w3 // Clear low bits mov x2, #1 add x1, x1, x0, lsr #3 // Get word offset lsl x3, x2, x3 // Create mask @@ -41,8 +41,8 @@ ENDPROC(\name ) .macro testop, name, instr ENTRY( \name ) - and x3, x0, #63 // Get bit offset - eor x0, x0, x3 // Clear low bits + and w3, w0, #63 // Get bit offset + eor w0, w0, w3 // Clear low bits mov x2, #1 add x1, x1, x0, lsr #3 // Get word offset lsl x4, x2, x3 // Create mask diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 52638171d6fd..98af6e760cce 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -148,6 +148,7 @@ void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *regs) #define VM_FAULT_BADACCESS 0x020000 #define ESR_WRITE (1 << 6) +#define ESR_CM (1 << 8) #define ESR_LNX_EXEC (1 << 24) /* @@ -206,7 +207,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, struct task_struct *tsk; struct mm_struct *mm; int fault, sig, code; - int write = esr & ESR_WRITE; + bool write = (esr & ESR_WRITE) && !(esr & ESR_CM); unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE | (write ? FAULT_FLAG_WRITE : 0); diff --git a/arch/avr32/Kconfig b/arch/avr32/Kconfig index 22c40308360b..bdc35589277f 100644 --- a/arch/avr32/Kconfig +++ b/arch/avr32/Kconfig @@ -26,9 +26,6 @@ config AVR32 There is an AVR32 Linux project with a web page at http://avr32linux.org/. -config GENERIC_GPIO - def_bool y - config STACKTRACE_SUPPORT def_bool y diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig index 453ebe46b065..a117652b5fea 100644 --- a/arch/blackfin/Kconfig +++ b/arch/blackfin/Kconfig @@ -27,7 +27,7 @@ config BLACKFIN select HAVE_OPROFILE select HAVE_PERF_EVENTS select ARCH_HAVE_CUSTOM_GPIO_H - select ARCH_WANT_OPTIONAL_GPIOLIB + select ARCH_REQUIRE_GPIOLIB select HAVE_UID16 select HAVE_UNDERSCORE_SYMBOL_PREFIX select VIRT_TO_BUS @@ -52,9 +52,6 @@ config GENERIC_BUG config ZONE_DMA def_bool y -config GENERIC_GPIO - def_bool y - config FORCE_MAX_ZONEORDER int default "14" diff --git a/arch/blackfin/Makefile b/arch/blackfin/Makefile index 66cf00095b84..1fce08632ad7 100644 --- a/arch/blackfin/Makefile +++ b/arch/blackfin/Makefile @@ -141,11 +141,11 @@ archclean: INSTALL_PATH ?= /tftpboot boot := arch/$(ARCH)/boot -BOOT_TARGETS = vmImage vmImage.bin vmImage.bz2 vmImage.gz vmImage.lzma vmImage.lzo vmImage.xip +BOOT_TARGETS = uImage uImage.bin uImage.bz2 uImage.gz uImage.lzma uImage.lzo uImage.xip PHONY += $(BOOT_TARGETS) install -KBUILD_IMAGE := $(boot)/vmImage +KBUILD_IMAGE := $(boot)/uImage -all: vmImage +all: uImage $(BOOT_TARGETS): vmlinux $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@ diff --git a/arch/blackfin/boot/Makefile b/arch/blackfin/boot/Makefile index f7d27d50d02c..3efaa094fb90 100644 --- a/arch/blackfin/boot/Makefile +++ b/arch/blackfin/boot/Makefile @@ -6,7 +6,7 @@ # for more details. # -targets := vmImage vmImage.bin vmImage.bz2 vmImage.gz vmImage.lzma vmImage.lzo vmImage.xip +targets := uImage uImage.bin uImage.bz2 uImage.gz uImage.lzma uImage.lzo uImage.xip extra-y += vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.lzo vmlinux.bin.xip ifeq ($(CONFIG_RAMKERNEL),y) @@ -39,22 +39,22 @@ quiet_cmd_mk_bin_xip = BIN $@ $(obj)/vmlinux.bin.xip: $(obj)/vmlinux.bin FORCE $(call if_changed,mk_bin_xip) -$(obj)/vmImage.bin: $(obj)/vmlinux.bin +$(obj)/uImage.bin: $(obj)/vmlinux.bin $(call if_changed,uimage,none) -$(obj)/vmImage.bz2: $(obj)/vmlinux.bin.bz2 +$(obj)/uImage.bz2: $(obj)/vmlinux.bin.bz2 $(call if_changed,uimage,bzip2) -$(obj)/vmImage.gz: $(obj)/vmlinux.bin.gz +$(obj)/uImage.gz: $(obj)/vmlinux.bin.gz $(call if_changed,uimage,gzip) -$(obj)/vmImage.lzma: $(obj)/vmlinux.bin.lzma +$(obj)/uImage.lzma: $(obj)/vmlinux.bin.lzma $(call if_changed,uimage,lzma) -$(obj)/vmImage.lzo: $(obj)/vmlinux.bin.lzo +$(obj)/uImage.lzo: $(obj)/vmlinux.bin.lzo $(call if_changed,uimage,lzo) -$(obj)/vmImage.xip: $(obj)/vmlinux.bin.xip +$(obj)/uImage.xip: $(obj)/vmlinux.bin.xip $(call if_changed,uimage,none) suffix-y := bin @@ -64,7 +64,7 @@ suffix-$(CONFIG_KERNEL_LZMA) := lzma suffix-$(CONFIG_KERNEL_LZO) := lzo suffix-$(CONFIG_ROMKERNEL) := xip -$(obj)/vmImage: $(obj)/vmImage.$(suffix-y) +$(obj)/uImage: $(obj)/uImage.$(suffix-y) @ln -sf $(notdir $<) $@ install: diff --git a/arch/blackfin/include/asm/atomic.h b/arch/blackfin/include/asm/atomic.h index c8db653c72d2..a107a98e9978 100644 --- a/arch/blackfin/include/asm/atomic.h +++ b/arch/blackfin/include/asm/atomic.h @@ -11,7 +11,9 @@ #ifdef CONFIG_SMP +#include #include +#include asmlinkage int __raw_uncached_fetch_asm(const volatile int *ptr); asmlinkage int __raw_atomic_update_asm(volatile int *ptr, int value); diff --git a/arch/blackfin/include/asm/bfin_sdh.h b/arch/blackfin/include/asm/bfin_sdh.h index 6a4cfe2d3367..a99957ea9e9b 100644 --- a/arch/blackfin/include/asm/bfin_sdh.h +++ b/arch/blackfin/include/asm/bfin_sdh.h @@ -24,18 +24,27 @@ struct bfin_sd_host { #define CMD_INT_E (1 << 8) /* Command Interrupt */ #define CMD_PEND_E (1 << 9) /* Command Pending */ #define CMD_E (1 << 10) /* Command Enable */ +#ifdef RSI_BLKSZ +#define CMD_CRC_CHECK_D (1 << 11) /* CRC Check is disabled */ +#define CMD_DATA0_BUSY (1 << 12) /* Check for Busy State on the DATA0 pin */ +#endif /* SDH_PWR_CTL bitmasks */ +#ifndef RSI_BLKSZ #define PWR_ON 0x3 /* Power On */ #define SD_CMD_OD (1 << 6) /* Open Drain Output */ #define ROD_CTL (1 << 7) /* Rod Control */ +#endif /* SDH_CLK_CTL bitmasks */ #define CLKDIV 0xff /* MC_CLK Divisor */ #define CLK_E (1 << 8) /* MC_CLK Bus Clock Enable */ #define PWR_SV_E (1 << 9) /* Power Save Enable */ #define CLKDIV_BYPASS (1 << 10) /* Bypass Divisor */ -#define WIDE_BUS (1 << 11) /* Wide Bus Mode Enable */ +#define BUS_MODE_MASK 0x1800 /* Bus Mode Mask */ +#define STD_BUS_1 0x000 /* Standard Bus 1 bit mode */ +#define WIDE_BUS_4 0x800 /* Wide Bus 4 bit mode */ +#define BYTE_BUS_8 0x1000 /* Byte Bus 8 bit mode */ /* SDH_RESP_CMD bitmasks */ #define RESP_CMD 0x3f /* Response Command */ @@ -45,7 +54,13 @@ struct bfin_sd_host { #define DTX_DIR (1 << 1) /* Data Transfer Direction */ #define DTX_MODE (1 << 2) /* Data Transfer Mode */ #define DTX_DMA_E (1 << 3) /* Data Transfer DMA Enable */ +#ifndef RSI_BLKSZ #define DTX_BLK_LGTH (0xf << 4) /* Data Transfer Block Length */ +#else + +/* Bit masks for SDH_BLK_SIZE */ +#define DTX_BLK_LGTH 0x1fff /* Data Transfer Block Length */ +#endif /* SDH_STATUS bitmasks */ #define CMD_CRC_FAIL (1 << 0) /* CMD CRC Fail */ @@ -114,10 +129,14 @@ struct bfin_sd_host { /* SDH_E_STATUS bitmasks */ #define SDIO_INT_DET (1 << 1) /* SDIO Int Detected */ #define SD_CARD_DET (1 << 4) /* SD Card Detect */ +#define SD_CARD_BUSYMODE (1 << 31) /* Card is in Busy mode */ +#define SD_CARD_SLPMODE (1 << 30) /* Card in Sleep Mode */ +#define SD_CARD_READY (1 << 17) /* Card Ready */ /* SDH_E_MASK bitmasks */ #define SDIO_MSK (1 << 1) /* Mask SDIO Int Detected */ -#define SCD_MSK (1 << 6) /* Mask Card Detect */ +#define SCD_MSK (1 << 4) /* Mask Card Detect */ +#define CARD_READY_MSK (1 << 16) /* Mask Card Ready */ /* SDH_CFG bitmasks */ #define CLKS_EN (1 << 0) /* Clocks Enable */ @@ -126,7 +145,15 @@ struct bfin_sd_host { #define SD_RST (1 << 4) /* SDMMC Reset */ #define PUP_SDDAT (1 << 5) /* Pull-up SD_DAT */ #define PUP_SDDAT3 (1 << 6) /* Pull-up SD_DAT3 */ +#ifndef RSI_BLKSZ #define PD_SDDAT3 (1 << 7) /* Pull-down SD_DAT3 */ +#else +#define PWR_ON 0x600 /* Power On */ +#define SD_CMD_OD (1 << 11) /* Open Drain Output */ +#define BOOT_EN (1 << 12) /* Boot Enable */ +#define BOOT_MODE (1 << 13) /* Alternate Boot Mode */ +#define BOOT_ACK_EN (1 << 14) /* Boot ACK is expected */ +#endif /* SDH_RD_WAIT_EN bitmasks */ #define RWR (1 << 0) /* Read Wait Request */ diff --git a/arch/blackfin/include/asm/bitops.h b/arch/blackfin/include/asm/bitops.h index 8a0fed16058f..0ca40dd44724 100644 --- a/arch/blackfin/include/asm/bitops.h +++ b/arch/blackfin/include/asm/bitops.h @@ -41,6 +41,7 @@ #include #else +#include #include /* swab32 */ #include diff --git a/arch/blackfin/include/asm/def_LPBlackfin.h b/arch/blackfin/include/asm/def_LPBlackfin.h index fe0ca03a1cb2..ca67145c6a45 100644 --- a/arch/blackfin/include/asm/def_LPBlackfin.h +++ b/arch/blackfin/include/asm/def_LPBlackfin.h @@ -622,10 +622,12 @@ do { \ #define PAGE_SIZE_4KB 0x00010000 /* 4 KB page size */ #define PAGE_SIZE_1MB 0x00020000 /* 1 MB page size */ #define PAGE_SIZE_4MB 0x00030000 /* 4 MB page size */ +#ifdef CONFIG_BF60x #define PAGE_SIZE_16KB 0x00040000 /* 16 KB page size */ #define PAGE_SIZE_64KB 0x00050000 /* 64 KB page size */ #define PAGE_SIZE_16MB 0x00060000 /* 16 MB page size */ #define PAGE_SIZE_64MB 0x00070000 /* 64 MB page size */ +#endif #define CPLB_L1SRAM 0x00000020 /* 0=SRAM mapped in L1, 0=SRAM not * mapped to L1 */ diff --git a/arch/blackfin/include/asm/mem_init.h b/arch/blackfin/include/asm/mem_init.h index 9b33e7247864..c865b33eeb68 100644 --- a/arch/blackfin/include/asm/mem_init.h +++ b/arch/blackfin/include/asm/mem_init.h @@ -335,6 +335,7 @@ struct ddr_config { u32 ddr_clk; u32 dmc_ddrctl; + u32 dmc_effctl; u32 dmc_ddrcfg; u32 dmc_ddrtr0; u32 dmc_ddrtr1; @@ -348,6 +349,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [0] = { .ddr_clk = 125, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20705212, .dmc_ddrtr1 = 0x201003CF, @@ -358,6 +360,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [1] = { .ddr_clk = 133, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20806313, .dmc_ddrtr1 = 0x2013040D, @@ -368,6 +371,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [2] = { .ddr_clk = 150, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20A07323, .dmc_ddrtr1 = 0x20160492, @@ -378,6 +382,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [3] = { .ddr_clk = 166, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20A07323, .dmc_ddrtr1 = 0x2016050E, @@ -388,6 +393,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [4] = { .ddr_clk = 200, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20a07323, .dmc_ddrtr1 = 0x2016050f, @@ -398,6 +404,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [5] = { .ddr_clk = 225, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20E0A424, .dmc_ddrtr1 = 0x302006DB, @@ -408,6 +415,7 @@ static struct ddr_config ddr_config_table[] __attribute__((section(".data_l1"))) [6] = { .ddr_clk = 250, .dmc_ddrctl = 0x00000904, + .dmc_effctl = 0x004400C0, .dmc_ddrcfg = 0x00000422, .dmc_ddrtr0 = 0x20E0A424, .dmc_ddrtr1 = 0x3020079E, @@ -469,6 +477,7 @@ static inline void init_dmc(u32 dmc_clk) bfin_write_DMC0_TR2(ddr_config_table[i].dmc_ddrtr2); bfin_write_DMC0_MR(ddr_config_table[i].dmc_ddrmr); bfin_write_DMC0_EMR1(ddr_config_table[i].dmc_ddrmr1); + bfin_write_DMC0_EFFCTL(ddr_config_table[i].dmc_effctl); bfin_write_DMC0_CTL(ddr_config_table[i].dmc_ddrctl); break; } diff --git a/arch/blackfin/kernel/cplb-nompu/cplbinit.c b/arch/blackfin/kernel/cplb-nompu/cplbinit.c index 34e96ce02aa9..b49a53b583d5 100644 --- a/arch/blackfin/kernel/cplb-nompu/cplbinit.c +++ b/arch/blackfin/kernel/cplb-nompu/cplbinit.c @@ -30,6 +30,7 @@ void __init generate_cplb_tables_cpu(unsigned int cpu) { int i_d, i_i; unsigned long addr; + unsigned long cplb_pageflags, cplb_pagesize; struct cplb_entry *d_tbl = dcplb_tbl[cpu]; struct cplb_entry *i_tbl = icplb_tbl[cpu]; @@ -49,11 +50,20 @@ void __init generate_cplb_tables_cpu(unsigned int cpu) /* Cover kernel memory with 4M pages. */ addr = 0; - for (; addr < memory_start; addr += 4 * 1024 * 1024) { +#ifdef PAGE_SIZE_16MB + cplb_pageflags = PAGE_SIZE_16MB; + cplb_pagesize = SIZE_16M; +#else + cplb_pageflags = PAGE_SIZE_4MB; + cplb_pagesize = SIZE_4M; +#endif + + + for (; addr < memory_start; addr += cplb_pagesize) { d_tbl[i_d].addr = addr; - d_tbl[i_d++].data = SDRAM_DGENERIC | PAGE_SIZE_4MB; + d_tbl[i_d++].data = SDRAM_DGENERIC | cplb_pageflags; i_tbl[i_i].addr = addr; - i_tbl[i_i++].data = SDRAM_IGENERIC | PAGE_SIZE_4MB; + i_tbl[i_i++].data = SDRAM_IGENERIC | cplb_pageflags; } #ifdef CONFIG_ROMKERNEL diff --git a/arch/blackfin/kernel/cplb-nompu/cplbmgr.c b/arch/blackfin/kernel/cplb-nompu/cplbmgr.c index e854f9066cbd..79cc0f6dcdd5 100644 --- a/arch/blackfin/kernel/cplb-nompu/cplbmgr.c +++ b/arch/blackfin/kernel/cplb-nompu/cplbmgr.c @@ -145,7 +145,7 @@ MGR_ATTR static int dcplb_miss(int cpu) unsigned long addr = bfin_read_DCPLB_FAULT_ADDR(); int status = bfin_read_DCPLB_STATUS(); int idx; - unsigned long d_data, base, addr1, eaddr; + unsigned long d_data, base, addr1, eaddr, cplb_pagesize, cplb_pageflags; nr_dcplb_miss[cpu]++; if (unlikely(status & FAULT_USERSUPV)) @@ -167,18 +167,37 @@ MGR_ATTR static int dcplb_miss(int cpu) if (unlikely(d_data == 0)) return CPLB_NO_ADDR_MATCH; - addr1 = addr & ~(SIZE_4M - 1); addr &= ~(SIZE_1M - 1); d_data |= PAGE_SIZE_1MB; - if (addr1 >= base && (addr1 + SIZE_4M) <= eaddr) { + + /* BF60x support large than 4M CPLB page size */ +#ifdef PAGE_SIZE_16MB + cplb_pageflags = PAGE_SIZE_16MB; + cplb_pagesize = SIZE_16M; +#else + cplb_pageflags = PAGE_SIZE_4MB; + cplb_pagesize = SIZE_4M; +#endif + +find_pagesize: + addr1 = addr & ~(cplb_pagesize - 1); + if (addr1 >= base && (addr1 + cplb_pagesize) <= eaddr) { /* * This works because * (PAGE_SIZE_4MB & PAGE_SIZE_1MB) == PAGE_SIZE_1MB. */ - d_data |= PAGE_SIZE_4MB; + d_data |= cplb_pageflags; addr = addr1; + goto found_pagesize; + } else { + if (cplb_pagesize > SIZE_4M) { + cplb_pageflags = PAGE_SIZE_4MB; + cplb_pagesize = SIZE_4M; + goto find_pagesize; + } } +found_pagesize: #ifdef CONFIG_BF60x if ((addr >= ASYNC_BANK0_BASE) && (addr < ASYNC_BANK3_BASE + ASYNC_BANK3_SIZE)) diff --git a/arch/blackfin/kernel/cplbinfo.c b/arch/blackfin/kernel/cplbinfo.c index 404045dcc5e4..5b80d59e66e5 100644 --- a/arch/blackfin/kernel/cplbinfo.c +++ b/arch/blackfin/kernel/cplbinfo.c @@ -17,8 +17,13 @@ #include #include -static char const page_strtbl[][3] = { "1K", "4K", "1M", "4M" }; -#define page(flags) (((flags) & 0x30000) >> 16) +static char const page_strtbl[][4] = { + "1K", "4K", "1M", "4M", +#ifdef CONFIG_BF60x + "16K", "64K", "16M", "64M", +#endif +}; +#define page(flags) (((flags) & 0x70000) >> 16) #define strpage(flags) page_strtbl[page(flags)] struct cplbinfo_data { diff --git a/arch/blackfin/kernel/setup.c b/arch/blackfin/kernel/setup.c index fb96e607adcf..107b306b06f1 100644 --- a/arch/blackfin/kernel/setup.c +++ b/arch/blackfin/kernel/setup.c @@ -1314,7 +1314,7 @@ static int show_cpuinfo(struct seq_file *m, void *v) seq_printf(m, "(Compiled for Rev %d)", bfin_compiled_revid()); } - seq_printf(m, "\ncpu MHz\t\t: %lu.%03lu/%lu.%03lu\n", + seq_printf(m, "\ncpu MHz\t\t: %lu.%06lu/%lu.%06lu\n", cclk/1000000, cclk%1000000, sclk/1000000, sclk%1000000); seq_printf(m, "bogomips\t: %lu.%02lu\n" diff --git a/arch/blackfin/mach-bf537/boards/stamp.c b/arch/blackfin/mach-bf537/boards/stamp.c index 95114ed395ac..6a3a14bcd3a1 100644 --- a/arch/blackfin/mach-bf537/boards/stamp.c +++ b/arch/blackfin/mach-bf537/boards/stamp.c @@ -455,6 +455,7 @@ static struct platform_device bfin_async_nand_device = { static void bfin_plat_nand_init(void) { gpio_request(BFIN_NAND_PLAT_READY, "bfin_nand_plat"); + gpio_direction_input(BFIN_NAND_PLAT_READY); } #else static void bfin_plat_nand_init(void) {} diff --git a/arch/blackfin/mach-bf538/boards/ezkit.c b/arch/blackfin/mach-bf538/boards/ezkit.c index a4fce0370c1d..755f0dc12010 100644 --- a/arch/blackfin/mach-bf538/boards/ezkit.c +++ b/arch/blackfin/mach-bf538/boards/ezkit.c @@ -764,7 +764,6 @@ static struct platform_device i2c_bfin_twi1_device = { .num_resources = ARRAY_SIZE(bfin_twi1_resource), .resource = bfin_twi1_resource, }; -#endif /* CONFIG_BF542 */ #endif /* CONFIG_I2C_BLACKFIN_TWI */ #if defined(CONFIG_KEYBOARD_GPIO) || defined(CONFIG_KEYBOARD_GPIO_MODULE) diff --git a/arch/blackfin/mach-bf609/include/mach/cdefBF60x_base.h b/arch/blackfin/mach-bf609/include/mach/cdefBF60x_base.h index 4954cf3f7e16..102ee4025ac9 100644 --- a/arch/blackfin/mach-bf609/include/mach/cdefBF60x_base.h +++ b/arch/blackfin/mach-bf609/include/mach/cdefBF60x_base.h @@ -312,6 +312,8 @@ #define bfin_write_DMC0_EMR1(val) bfin_write32(DMC0_EMR1, val) #define bfin_read_DMC0_CTL() bfin_read32(DMC0_CTL) #define bfin_write_DMC0_CTL(val) bfin_write32(DMC0_CTL, val) +#define bfin_read_DMC0_EFFCTL() bfin_read32(DMC0_EFFCTL) +#define bfin_write_DMC0_EFFCTL(val) bfin_write32(DMC0_EFFCTL, val) #define bfin_read_DMC0_STAT() bfin_read32(DMC0_STAT) #define bfin_write_DMC0_STAT(val) bfin_write32(DMC0_STAT, val) #define bfin_read_DMC0_DLLCTL() bfin_read32(DMC0_DLLCTL) diff --git a/arch/cris/Kconfig b/arch/cris/Kconfig index 06dd026533e3..8769a9045a54 100644 --- a/arch/cris/Kconfig +++ b/arch/cris/Kconfig @@ -264,7 +264,6 @@ config ETRAX_AXISFLASHMAP select MTD_CFI select MTD_CFI_AMDSTD select MTD_JEDECPROBE if ETRAX_ARCH_V32 - select MTD_CHAR select MTD_BLOCK select MTD_COMPLEX_MAPPINGS help diff --git a/arch/cris/arch-v32/drivers/Kconfig b/arch/cris/arch-v32/drivers/Kconfig index af4a486dadcd..c55971a40c34 100644 --- a/arch/cris/arch-v32/drivers/Kconfig +++ b/arch/cris/arch-v32/drivers/Kconfig @@ -404,7 +404,6 @@ config ETRAX_AXISFLASHMAP select MTD_CFI select MTD_CFI_AMDSTD select MTD_JEDECPROBE - select MTD_CHAR select MTD_BLOCK select MTD_COMPLEX_MAPPINGS help diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig index 04dff5bdcbf7..33a97929d055 100644 --- a/arch/hexagon/Kconfig +++ b/arch/hexagon/Kconfig @@ -30,8 +30,6 @@ config HEXAGON select GENERIC_CLOCKEVENTS_BROADCAST select MODULES_USE_ELF_RELA select GENERIC_CPU_DEVICES - select GENERIC_KERNEL_THREAD - select GENERIC_KERNEL_EXECVE ---help--- Qualcomm Hexagon is a processor architecture designed for high performance and low power across a wide variety of applications. @@ -157,9 +155,6 @@ source "mm/Kconfig" source "kernel/Kconfig.hz" -config GENERIC_GPIO - def_bool n - endmenu source "init/Kconfig" diff --git a/arch/hexagon/kernel/vm_entry.S b/arch/hexagon/kernel/vm_entry.S index e3086185fc9f..67c6ccc14770 100644 --- a/arch/hexagon/kernel/vm_entry.S +++ b/arch/hexagon/kernel/vm_entry.S @@ -291,12 +291,12 @@ event_dispatch: /* "Nested control path" -- if the previous mode was kernel */ { R0 = memw(R29 + #_PT_ER_VMEST); - R16.L = #LO(do_work_pending); + R26.L = #LO(do_work_pending); } { P0 = tstbit(R0, #HVM_VMEST_UM_SFT); if (!P0.new) jump:nt restore_all; - R16.H = #HI(do_work_pending); + R26.H = #HI(do_work_pending); R0 = #VM_INT_DISABLE; } @@ -304,7 +304,7 @@ event_dispatch: * Check also the return from fork/system call, normally coming back from * user mode * - * R16 needs to have do_work_pending, and R0 should have VM_INT_DISABLE + * R26 needs to have do_work_pending, and R0 should have VM_INT_DISABLE */ check_work_pending: @@ -313,7 +313,7 @@ check_work_pending: { R0 = R29; /* regs should still be at top of stack */ R1 = memw(THREADINFO_REG + #_THREAD_INFO_FLAGS); - callr R16; + callr R26; } { @@ -375,11 +375,11 @@ _K_enter_debug: ret_from_fork: { call schedule_tail - R16.H = #HI(do_work_pending); + R26.H = #HI(do_work_pending); } { P0 = cmp.eq(R24, #0); - R16.L = #LO(do_work_pending); + R26.L = #LO(do_work_pending); R0 = #VM_INT_DISABLE; } if P0 jump check_work_pending diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index d393f841ff5a..1a2b7749b047 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -101,9 +101,6 @@ config GENERIC_CALIBRATE_DELAY config HAVE_SETUP_PER_CPU_AREA def_bool y -config GENERIC_GPIO - bool - config DMI bool default y diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig index 6de813370b8c..821170e5f6ed 100644 --- a/arch/m68k/Kconfig +++ b/arch/m68k/Kconfig @@ -35,9 +35,6 @@ config ARCH_HAS_ILOG2_U32 config ARCH_HAS_ILOG2_U64 bool -config GENERIC_GPIO - bool - config GENERIC_HWEIGHT bool default y diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu index b1cfff832fb5..33013dfcd3e1 100644 --- a/arch/m68k/Kconfig.cpu +++ b/arch/m68k/Kconfig.cpu @@ -22,8 +22,7 @@ config M68KCLASSIC config COLDFIRE bool "Coldfire CPU family support" - select GENERIC_GPIO - select ARCH_WANT_OPTIONAL_GPIOLIB + select ARCH_REQUIRE_GPIOLIB select ARCH_HAVE_CUSTOM_GPIO_H select CPU_HAS_NO_BITFIELDS select CPU_HAS_NO_MULDIV64 @@ -224,13 +223,25 @@ config M5307 help Motorola ColdFire 5307 processor support. +config M53xx + bool + config M532x bool "MCF532x" depends on !MMU + select M53xx select HAVE_CACHE_CB help Freescale (Motorola) ColdFire 532x processor support. +config M537x + bool "MCF537x" + depends on !MMU + select M53xx + select HAVE_CACHE_CB + help + Freescale ColdFire 537x processor support. + config M5407 bool "MCF5407" depends on !MMU diff --git a/arch/m68k/Kconfig.machine b/arch/m68k/Kconfig.machine index 7240584d3439..b9ab0a69561c 100644 --- a/arch/m68k/Kconfig.machine +++ b/arch/m68k/Kconfig.machine @@ -358,6 +358,13 @@ config COBRA5329 help Support for the senTec COBRA5329 board. +config M5373EVB + bool "Freescale M5373EVB board support" + depends on M537x + select FREESCALE + help + Support for the Freescale M5373EVB board. + config M5407C3 bool "Motorola M5407C3 board support" depends on M5407 @@ -539,15 +546,6 @@ config ROMVEC 68000 type variants the vectors are at the base of the boot device on system startup. -config ROMVECSIZE - hex "Size of ROM vector region (in bytes)" - default "0x400" - depends on ROM - help - Define the size of the vector region in ROM. For most 68000 - variants this would be 0x400 bytes in size. Set to 0 if you do - not want a vector region at the start of the ROM. - config ROMSTART hex "Address of the base of system image in ROM" default "0x400" diff --git a/arch/m68k/Makefile b/arch/m68k/Makefile index 2f02acfb8edf..7f7830f2c5bc 100644 --- a/arch/m68k/Makefile +++ b/arch/m68k/Makefile @@ -45,6 +45,7 @@ cpuflags-$(CONFIG_M5441x) := $(call cc-option,-mcpu=54455,-mcfv4e) cpuflags-$(CONFIG_M54xx) := $(call cc-option,-mcpu=5475,-m5200) cpuflags-$(CONFIG_M5407) := $(call cc-option,-mcpu=5407,-m5200) cpuflags-$(CONFIG_M532x) := $(call cc-option,-mcpu=532x,-m5307) +cpuflags-$(CONFIG_M537x) := $(call cc-option,-mcpu=537x,-m5307) cpuflags-$(CONFIG_M5307) := $(call cc-option,-mcpu=5307,-m5200) cpuflags-$(CONFIG_M528x) := $(call cc-option,-mcpu=528x,-m5307) cpuflags-$(CONFIG_M5275) := $(call cc-option,-mcpu=5275,-m5307) diff --git a/arch/m68k/include/asm/dbg.h b/arch/m68k/include/asm/dbg.h deleted file mode 100644 index 27af3270f671..000000000000 --- a/arch/m68k/include/asm/dbg.h +++ /dev/null @@ -1,6 +0,0 @@ -#define DEBUG 1 -#ifdef CONFIG_COLDFIRE -#define BREAK asm volatile ("halt") -#else -#define BREAK *(volatile unsigned char *)0xdeadbee0 = 0 -#endif diff --git a/arch/m68k/include/asm/dma.h b/arch/m68k/include/asm/dma.h index 0ff3fc6a6d9a..429fe26e320c 100644 --- a/arch/m68k/include/asm/dma.h +++ b/arch/m68k/include/asm/dma.h @@ -39,7 +39,7 @@ #define MAX_M68K_DMA_CHANNELS 4 #elif defined(CONFIG_M5272) #define MAX_M68K_DMA_CHANNELS 1 -#elif defined(CONFIG_M532x) +#elif defined(CONFIG_M53xx) #define MAX_M68K_DMA_CHANNELS 0 #else #define MAX_M68K_DMA_CHANNELS 2 diff --git a/arch/m68k/include/asm/m532xsim.h b/arch/m68k/include/asm/m532xsim.h deleted file mode 100644 index 8668e47ced0e..000000000000 --- a/arch/m68k/include/asm/m532xsim.h +++ /dev/null @@ -1,1241 +0,0 @@ -/****************************************************************************/ - -/* - * m532xsim.h -- ColdFire 5329 registers - */ - -/****************************************************************************/ -#ifndef m532xsim_h -#define m532xsim_h -/****************************************************************************/ - -#define CPU_NAME "COLDFIRE(m532x)" -#define CPU_INSTR_PER_JIFFY 3 -#define MCF_BUSCLK (MCF_CLK / 3) - -#include - -#define MCFINT_VECBASE 64 -#define MCFINT_UART0 26 /* Interrupt number for UART0 */ -#define MCFINT_UART1 27 /* Interrupt number for UART1 */ -#define MCFINT_UART2 28 /* Interrupt number for UART2 */ -#define MCFINT_QSPI 31 /* Interrupt number for QSPI */ -#define MCFINT_FECRX0 36 /* Interrupt number for FEC */ -#define MCFINT_FECTX0 40 /* Interrupt number for FEC */ -#define MCFINT_FECENTC0 42 /* Interrupt number for FEC */ - -#define MCF_IRQ_UART0 (MCFINT_VECBASE + MCFINT_UART0) -#define MCF_IRQ_UART1 (MCFINT_VECBASE + MCFINT_UART1) -#define MCF_IRQ_UART2 (MCFINT_VECBASE + MCFINT_UART2) - -#define MCF_IRQ_FECRX0 (MCFINT_VECBASE + MCFINT_FECRX0) -#define MCF_IRQ_FECTX0 (MCFINT_VECBASE + MCFINT_FECTX0) -#define MCF_IRQ_FECENTC0 (MCFINT_VECBASE + MCFINT_FECENTC0) - -#define MCF_IRQ_QSPI (MCFINT_VECBASE + MCFINT_QSPI) - -#define MCF_WTM_WCR 0xFC098000 - -/* - * Define the 532x SIM register set addresses. - */ -#define MCFSIM_IPRL 0xFC048004 -#define MCFSIM_IPRH 0xFC048000 -#define MCFSIM_IPR MCFSIM_IPRL -#define MCFSIM_IMRL 0xFC04800C -#define MCFSIM_IMRH 0xFC048008 -#define MCFSIM_IMR MCFSIM_IMRL -#define MCFSIM_ICR0 0xFC048040 -#define MCFSIM_ICR1 0xFC048041 -#define MCFSIM_ICR2 0xFC048042 -#define MCFSIM_ICR3 0xFC048043 -#define MCFSIM_ICR4 0xFC048044 -#define MCFSIM_ICR5 0xFC048045 -#define MCFSIM_ICR6 0xFC048046 -#define MCFSIM_ICR7 0xFC048047 -#define MCFSIM_ICR8 0xFC048048 -#define MCFSIM_ICR9 0xFC048049 -#define MCFSIM_ICR10 0xFC04804A -#define MCFSIM_ICR11 0xFC04804B - -/* - * Some symbol defines for the above... - */ -#define MCFSIM_SWDICR MCFSIM_ICR0 /* Watchdog timer ICR */ -#define MCFSIM_TIMER1ICR MCFSIM_ICR1 /* Timer 1 ICR */ -#define MCFSIM_TIMER2ICR MCFSIM_ICR2 /* Timer 2 ICR */ -#define MCFSIM_UART1ICR MCFSIM_ICR4 /* UART 1 ICR */ -#define MCFSIM_UART2ICR MCFSIM_ICR5 /* UART 2 ICR */ -#define MCFSIM_DMA0ICR MCFSIM_ICR6 /* DMA 0 ICR */ -#define MCFSIM_DMA1ICR MCFSIM_ICR7 /* DMA 1 ICR */ -#define MCFSIM_DMA2ICR MCFSIM_ICR8 /* DMA 2 ICR */ -#define MCFSIM_DMA3ICR MCFSIM_ICR9 /* DMA 3 ICR */ - - -#define MCFINTC0_SIMR 0xFC04801C -#define MCFINTC0_CIMR 0xFC04801D -#define MCFINTC0_ICR0 0xFC048040 -#define MCFINTC1_SIMR 0xFC04C01C -#define MCFINTC1_CIMR 0xFC04C01D -#define MCFINTC1_ICR0 0xFC04C040 -#define MCFINTC2_SIMR (0) -#define MCFINTC2_CIMR (0) -#define MCFINTC2_ICR0 (0) - -#define MCFSIM_ICR_TIMER1 (0xFC048040+32) -#define MCFSIM_ICR_TIMER2 (0xFC048040+33) - -/* - * Define system peripheral IRQ usage. - */ -#define MCF_IRQ_TIMER (64 + 32) /* Timer0 */ -#define MCF_IRQ_PROFILER (64 + 33) /* Timer1 */ - -/* - * UART module. - */ -#define MCFUART_BASE0 0xFC060000 /* Base address of UART1 */ -#define MCFUART_BASE1 0xFC064000 /* Base address of UART2 */ -#define MCFUART_BASE2 0xFC068000 /* Base address of UART3 */ - -/* - * FEC module. - */ -#define MCFFEC_BASE0 0xFC030000 /* Base address of FEC0 */ -#define MCFFEC_SIZE0 0x800 /* Size of FEC0 region */ - -/* - * QSPI module. - */ -#define MCFQSPI_BASE 0xFC058000 /* Base address of QSPI */ -#define MCFQSPI_SIZE 0x40 /* Size of QSPI region */ - -#define MCFQSPI_CS0 84 -#define MCFQSPI_CS1 85 -#define MCFQSPI_CS2 86 - -/* - * Timer module. - */ -#define MCFTIMER_BASE1 0xFC070000 /* Base address of TIMER1 */ -#define MCFTIMER_BASE2 0xFC074000 /* Base address of TIMER2 */ -#define MCFTIMER_BASE3 0xFC078000 /* Base address of TIMER3 */ -#define MCFTIMER_BASE4 0xFC07C000 /* Base address of TIMER4 */ - -/********************************************************************* - * - * Reset Controller Module - * - *********************************************************************/ - -#define MCF_RCR 0xFC0A0000 -#define MCF_RSR 0xFC0A0001 - -#define MCF_RCR_SWRESET 0x80 /* Software reset bit */ -#define MCF_RCR_FRCSTOUT 0x40 /* Force external reset */ - - -/* - * Power Management - */ -#define MCFPM_WCR 0xfc040013 -#define MCFPM_PPMSR0 0xfc04002c -#define MCFPM_PPMCR0 0xfc04002d -#define MCFPM_PPMSR1 0xfc04002e -#define MCFPM_PPMCR1 0xfc04002f -#define MCFPM_PPMHR0 0xfc040030 -#define MCFPM_PPMLR0 0xfc040034 -#define MCFPM_PPMHR1 0xfc040038 -#define MCFPM_LPCR 0xec090007 - -/* - * The M5329EVB board needs a help getting its devices initialized - * at kernel start time if dBUG doesn't set it up (for example - * it is not used), so we need to do it manually. - */ -#ifdef __ASSEMBLER__ -.macro m5329EVB_setup - movel #0xFC098000, %a7 - movel #0x0, (%a7) -#define CORE_SRAM 0x80000000 -#define CORE_SRAM_SIZE 0x8000 - movel #CORE_SRAM, %d0 - addl #0x221, %d0 - movec %d0,%RAMBAR1 - movel #CORE_SRAM, %sp - addl #CORE_SRAM_SIZE, %sp - jsr sysinit -.endm -#define PLATFORM_SETUP m5329EVB_setup - -#endif /* __ASSEMBLER__ */ - -/********************************************************************* - * - * Chip Configuration Module (CCM) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCF_CCM_CCR 0xFC0A0004 -#define MCF_CCM_RCON 0xFC0A0008 -#define MCF_CCM_CIR 0xFC0A000A -#define MCF_CCM_MISCCR 0xFC0A0010 -#define MCF_CCM_CDR 0xFC0A0012 -#define MCF_CCM_UHCSR 0xFC0A0014 -#define MCF_CCM_UOCSR 0xFC0A0016 - -/* Bit definitions and macros for MCF_CCM_CCR */ -#define MCF_CCM_CCR_RESERVED (0x0001) -#define MCF_CCM_CCR_PLL_MODE (0x0003) -#define MCF_CCM_CCR_OSC_MODE (0x0005) -#define MCF_CCM_CCR_BOOTPS(x) (((x)&0x0003)<<3|0x0001) -#define MCF_CCM_CCR_LOAD (0x0021) -#define MCF_CCM_CCR_LIMP (0x0041) -#define MCF_CCM_CCR_CSC(x) (((x)&0x0003)<<8|0x0001) - -/* Bit definitions and macros for MCF_CCM_RCON */ -#define MCF_CCM_RCON_RESERVED (0x0001) -#define MCF_CCM_RCON_PLL_MODE (0x0003) -#define MCF_CCM_RCON_OSC_MODE (0x0005) -#define MCF_CCM_RCON_BOOTPS(x) (((x)&0x0003)<<3|0x0001) -#define MCF_CCM_RCON_LOAD (0x0021) -#define MCF_CCM_RCON_LIMP (0x0041) -#define MCF_CCM_RCON_CSC(x) (((x)&0x0003)<<8|0x0001) - -/* Bit definitions and macros for MCF_CCM_CIR */ -#define MCF_CCM_CIR_PRN(x) (((x)&0x003F)<<0) -#define MCF_CCM_CIR_PIN(x) (((x)&0x03FF)<<6) - -/* Bit definitions and macros for MCF_CCM_MISCCR */ -#define MCF_CCM_MISCCR_USBSRC (0x0001) -#define MCF_CCM_MISCCR_USBDIV (0x0002) -#define MCF_CCM_MISCCR_SSI_SRC (0x0010) -#define MCF_CCM_MISCCR_TIM_DMA (0x0020) -#define MCF_CCM_MISCCR_SSI_PUS (0x0040) -#define MCF_CCM_MISCCR_SSI_PUE (0x0080) -#define MCF_CCM_MISCCR_LCD_CHEN (0x0100) -#define MCF_CCM_MISCCR_LIMP (0x1000) -#define MCF_CCM_MISCCR_PLL_LOCK (0x2000) - -/* Bit definitions and macros for MCF_CCM_CDR */ -#define MCF_CCM_CDR_SSIDIV(x) (((x)&0x000F)<<0) -#define MCF_CCM_CDR_LPDIV(x) (((x)&0x000F)<<8) - -/* Bit definitions and macros for MCF_CCM_UHCSR */ -#define MCF_CCM_UHCSR_XPDE (0x0001) -#define MCF_CCM_UHCSR_UHMIE (0x0002) -#define MCF_CCM_UHCSR_WKUP (0x0004) -#define MCF_CCM_UHCSR_PORTIND(x) (((x)&0x0003)<<14) - -/* Bit definitions and macros for MCF_CCM_UOCSR */ -#define MCF_CCM_UOCSR_XPDE (0x0001) -#define MCF_CCM_UOCSR_UOMIE (0x0002) -#define MCF_CCM_UOCSR_WKUP (0x0004) -#define MCF_CCM_UOCSR_PWRFLT (0x0008) -#define MCF_CCM_UOCSR_SEND (0x0010) -#define MCF_CCM_UOCSR_VVLD (0x0020) -#define MCF_CCM_UOCSR_BVLD (0x0040) -#define MCF_CCM_UOCSR_AVLD (0x0080) -#define MCF_CCM_UOCSR_DPPU (0x0100) -#define MCF_CCM_UOCSR_DCR_VBUS (0x0200) -#define MCF_CCM_UOCSR_CRG_VBUS (0x0400) -#define MCF_CCM_UOCSR_DRV_VBUS (0x0800) -#define MCF_CCM_UOCSR_DMPD (0x1000) -#define MCF_CCM_UOCSR_DPPD (0x2000) -#define MCF_CCM_UOCSR_PORTIND(x) (((x)&0x0003)<<14) - -/********************************************************************* - * - * FlexBus Chip Selects (FBCS) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCF_FBCS0_CSAR 0xFC008000 -#define MCF_FBCS0_CSMR 0xFC008004 -#define MCF_FBCS0_CSCR 0xFC008008 -#define MCF_FBCS1_CSAR 0xFC00800C -#define MCF_FBCS1_CSMR 0xFC008010 -#define MCF_FBCS1_CSCR 0xFC008014 -#define MCF_FBCS2_CSAR 0xFC008018 -#define MCF_FBCS2_CSMR 0xFC00801C -#define MCF_FBCS2_CSCR 0xFC008020 -#define MCF_FBCS3_CSAR 0xFC008024 -#define MCF_FBCS3_CSMR 0xFC008028 -#define MCF_FBCS3_CSCR 0xFC00802C -#define MCF_FBCS4_CSAR 0xFC008030 -#define MCF_FBCS4_CSMR 0xFC008034 -#define MCF_FBCS4_CSCR 0xFC008038 -#define MCF_FBCS5_CSAR 0xFC00803C -#define MCF_FBCS5_CSMR 0xFC008040 -#define MCF_FBCS5_CSCR 0xFC008044 - -/* Bit definitions and macros for MCF_FBCS_CSAR */ -#define MCF_FBCS_CSAR_BA(x) ((x)&0xFFFF0000) - -/* Bit definitions and macros for MCF_FBCS_CSMR */ -#define MCF_FBCS_CSMR_V (0x00000001) -#define MCF_FBCS_CSMR_WP (0x00000100) -#define MCF_FBCS_CSMR_BAM(x) (((x)&0x0000FFFF)<<16) -#define MCF_FBCS_CSMR_BAM_4G (0xFFFF0000) -#define MCF_FBCS_CSMR_BAM_2G (0x7FFF0000) -#define MCF_FBCS_CSMR_BAM_1G (0x3FFF0000) -#define MCF_FBCS_CSMR_BAM_1024M (0x3FFF0000) -#define MCF_FBCS_CSMR_BAM_512M (0x1FFF0000) -#define MCF_FBCS_CSMR_BAM_256M (0x0FFF0000) -#define MCF_FBCS_CSMR_BAM_128M (0x07FF0000) -#define MCF_FBCS_CSMR_BAM_64M (0x03FF0000) -#define MCF_FBCS_CSMR_BAM_32M (0x01FF0000) -#define MCF_FBCS_CSMR_BAM_16M (0x00FF0000) -#define MCF_FBCS_CSMR_BAM_8M (0x007F0000) -#define MCF_FBCS_CSMR_BAM_4M (0x003F0000) -#define MCF_FBCS_CSMR_BAM_2M (0x001F0000) -#define MCF_FBCS_CSMR_BAM_1M (0x000F0000) -#define MCF_FBCS_CSMR_BAM_1024K (0x000F0000) -#define MCF_FBCS_CSMR_BAM_512K (0x00070000) -#define MCF_FBCS_CSMR_BAM_256K (0x00030000) -#define MCF_FBCS_CSMR_BAM_128K (0x00010000) -#define MCF_FBCS_CSMR_BAM_64K (0x00000000) - -/* Bit definitions and macros for MCF_FBCS_CSCR */ -#define MCF_FBCS_CSCR_BSTW (0x00000008) -#define MCF_FBCS_CSCR_BSTR (0x00000010) -#define MCF_FBCS_CSCR_BEM (0x00000020) -#define MCF_FBCS_CSCR_PS(x) (((x)&0x00000003)<<6) -#define MCF_FBCS_CSCR_AA (0x00000100) -#define MCF_FBCS_CSCR_SBM (0x00000200) -#define MCF_FBCS_CSCR_WS(x) (((x)&0x0000003F)<<10) -#define MCF_FBCS_CSCR_WRAH(x) (((x)&0x00000003)<<16) -#define MCF_FBCS_CSCR_RDAH(x) (((x)&0x00000003)<<18) -#define MCF_FBCS_CSCR_ASET(x) (((x)&0x00000003)<<20) -#define MCF_FBCS_CSCR_SWSEN (0x00800000) -#define MCF_FBCS_CSCR_SWS(x) (((x)&0x0000003F)<<26) -#define MCF_FBCS_CSCR_PS_8 (0x0040) -#define MCF_FBCS_CSCR_PS_16 (0x0080) -#define MCF_FBCS_CSCR_PS_32 (0x0000) - -/********************************************************************* - * - * General Purpose I/O (GPIO) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCFGPIO_PODR_FECH (0xFC0A4000) -#define MCFGPIO_PODR_FECL (0xFC0A4001) -#define MCFGPIO_PODR_SSI (0xFC0A4002) -#define MCFGPIO_PODR_BUSCTL (0xFC0A4003) -#define MCFGPIO_PODR_BE (0xFC0A4004) -#define MCFGPIO_PODR_CS (0xFC0A4005) -#define MCFGPIO_PODR_PWM (0xFC0A4006) -#define MCFGPIO_PODR_FECI2C (0xFC0A4007) -#define MCFGPIO_PODR_UART (0xFC0A4009) -#define MCFGPIO_PODR_QSPI (0xFC0A400A) -#define MCFGPIO_PODR_TIMER (0xFC0A400B) -#define MCFGPIO_PODR_LCDDATAH (0xFC0A400D) -#define MCFGPIO_PODR_LCDDATAM (0xFC0A400E) -#define MCFGPIO_PODR_LCDDATAL (0xFC0A400F) -#define MCFGPIO_PODR_LCDCTLH (0xFC0A4010) -#define MCFGPIO_PODR_LCDCTLL (0xFC0A4011) -#define MCFGPIO_PDDR_FECH (0xFC0A4014) -#define MCFGPIO_PDDR_FECL (0xFC0A4015) -#define MCFGPIO_PDDR_SSI (0xFC0A4016) -#define MCFGPIO_PDDR_BUSCTL (0xFC0A4017) -#define MCFGPIO_PDDR_BE (0xFC0A4018) -#define MCFGPIO_PDDR_CS (0xFC0A4019) -#define MCFGPIO_PDDR_PWM (0xFC0A401A) -#define MCFGPIO_PDDR_FECI2C (0xFC0A401B) -#define MCFGPIO_PDDR_UART (0xFC0A401C) -#define MCFGPIO_PDDR_QSPI (0xFC0A401E) -#define MCFGPIO_PDDR_TIMER (0xFC0A401F) -#define MCFGPIO_PDDR_LCDDATAH (0xFC0A4021) -#define MCFGPIO_PDDR_LCDDATAM (0xFC0A4022) -#define MCFGPIO_PDDR_LCDDATAL (0xFC0A4023) -#define MCFGPIO_PDDR_LCDCTLH (0xFC0A4024) -#define MCFGPIO_PDDR_LCDCTLL (0xFC0A4025) -#define MCFGPIO_PPDSDR_FECH (0xFC0A4028) -#define MCFGPIO_PPDSDR_FECL (0xFC0A4029) -#define MCFGPIO_PPDSDR_SSI (0xFC0A402A) -#define MCFGPIO_PPDSDR_BUSCTL (0xFC0A402B) -#define MCFGPIO_PPDSDR_BE (0xFC0A402C) -#define MCFGPIO_PPDSDR_CS (0xFC0A402D) -#define MCFGPIO_PPDSDR_PWM (0xFC0A402E) -#define MCFGPIO_PPDSDR_FECI2C (0xFC0A402F) -#define MCFGPIO_PPDSDR_UART (0xFC0A4031) -#define MCFGPIO_PPDSDR_QSPI (0xFC0A4032) -#define MCFGPIO_PPDSDR_TIMER (0xFC0A4033) -#define MCFGPIO_PPDSDR_LCDDATAH (0xFC0A4035) -#define MCFGPIO_PPDSDR_LCDDATAM (0xFC0A4036) -#define MCFGPIO_PPDSDR_LCDDATAL (0xFC0A4037) -#define MCFGPIO_PPDSDR_LCDCTLH (0xFC0A4038) -#define MCFGPIO_PPDSDR_LCDCTLL (0xFC0A4039) -#define MCFGPIO_PCLRR_FECH (0xFC0A403C) -#define MCFGPIO_PCLRR_FECL (0xFC0A403D) -#define MCFGPIO_PCLRR_SSI (0xFC0A403E) -#define MCFGPIO_PCLRR_BUSCTL (0xFC0A403F) -#define MCFGPIO_PCLRR_BE (0xFC0A4040) -#define MCFGPIO_PCLRR_CS (0xFC0A4041) -#define MCFGPIO_PCLRR_PWM (0xFC0A4042) -#define MCFGPIO_PCLRR_FECI2C (0xFC0A4043) -#define MCFGPIO_PCLRR_UART (0xFC0A4045) -#define MCFGPIO_PCLRR_QSPI (0xFC0A4046) -#define MCFGPIO_PCLRR_TIMER (0xFC0A4047) -#define MCFGPIO_PCLRR_LCDDATAH (0xFC0A4049) -#define MCFGPIO_PCLRR_LCDDATAM (0xFC0A404A) -#define MCFGPIO_PCLRR_LCDDATAL (0xFC0A404B) -#define MCFGPIO_PCLRR_LCDCTLH (0xFC0A404C) -#define MCFGPIO_PCLRR_LCDCTLL (0xFC0A404D) -#define MCFGPIO_PAR_FEC (0xFC0A4050) -#define MCFGPIO_PAR_PWM (0xFC0A4051) -#define MCFGPIO_PAR_BUSCTL (0xFC0A4052) -#define MCFGPIO_PAR_FECI2C (0xFC0A4053) -#define MCFGPIO_PAR_BE (0xFC0A4054) -#define MCFGPIO_PAR_CS (0xFC0A4055) -#define MCFGPIO_PAR_SSI (0xFC0A4056) -#define MCFGPIO_PAR_UART (0xFC0A4058) -#define MCFGPIO_PAR_QSPI (0xFC0A405A) -#define MCFGPIO_PAR_TIMER (0xFC0A405C) -#define MCFGPIO_PAR_LCDDATA (0xFC0A405D) -#define MCFGPIO_PAR_LCDCTL (0xFC0A405E) -#define MCFGPIO_PAR_IRQ (0xFC0A4060) -#define MCFGPIO_MSCR_FLEXBUS (0xFC0A4064) -#define MCFGPIO_MSCR_SDRAM (0xFC0A4065) -#define MCFGPIO_DSCR_I2C (0xFC0A4068) -#define MCFGPIO_DSCR_PWM (0xFC0A4069) -#define MCFGPIO_DSCR_FEC (0xFC0A406A) -#define MCFGPIO_DSCR_UART (0xFC0A406B) -#define MCFGPIO_DSCR_QSPI (0xFC0A406C) -#define MCFGPIO_DSCR_TIMER (0xFC0A406D) -#define MCFGPIO_DSCR_SSI (0xFC0A406E) -#define MCFGPIO_DSCR_LCD (0xFC0A406F) -#define MCFGPIO_DSCR_DEBUG (0xFC0A4070) -#define MCFGPIO_DSCR_CLKRST (0xFC0A4071) -#define MCFGPIO_DSCR_IRQ (0xFC0A4072) - -/* Bit definitions and macros for MCF_GPIO_PODR_FECH */ -#define MCF_GPIO_PODR_FECH_PODR_FECH0 (0x01) -#define MCF_GPIO_PODR_FECH_PODR_FECH1 (0x02) -#define MCF_GPIO_PODR_FECH_PODR_FECH2 (0x04) -#define MCF_GPIO_PODR_FECH_PODR_FECH3 (0x08) -#define MCF_GPIO_PODR_FECH_PODR_FECH4 (0x10) -#define MCF_GPIO_PODR_FECH_PODR_FECH5 (0x20) -#define MCF_GPIO_PODR_FECH_PODR_FECH6 (0x40) -#define MCF_GPIO_PODR_FECH_PODR_FECH7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PODR_FECL */ -#define MCF_GPIO_PODR_FECL_PODR_FECL0 (0x01) -#define MCF_GPIO_PODR_FECL_PODR_FECL1 (0x02) -#define MCF_GPIO_PODR_FECL_PODR_FECL2 (0x04) -#define MCF_GPIO_PODR_FECL_PODR_FECL3 (0x08) -#define MCF_GPIO_PODR_FECL_PODR_FECL4 (0x10) -#define MCF_GPIO_PODR_FECL_PODR_FECL5 (0x20) -#define MCF_GPIO_PODR_FECL_PODR_FECL6 (0x40) -#define MCF_GPIO_PODR_FECL_PODR_FECL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PODR_SSI */ -#define MCF_GPIO_PODR_SSI_PODR_SSI0 (0x01) -#define MCF_GPIO_PODR_SSI_PODR_SSI1 (0x02) -#define MCF_GPIO_PODR_SSI_PODR_SSI2 (0x04) -#define MCF_GPIO_PODR_SSI_PODR_SSI3 (0x08) -#define MCF_GPIO_PODR_SSI_PODR_SSI4 (0x10) - -/* Bit definitions and macros for MCF_GPIO_PODR_BUSCTL */ -#define MCF_GPIO_PODR_BUSCTL_POSDR_BUSCTL0 (0x01) -#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL1 (0x02) -#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL2 (0x04) -#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PODR_BE */ -#define MCF_GPIO_PODR_BE_PODR_BE0 (0x01) -#define MCF_GPIO_PODR_BE_PODR_BE1 (0x02) -#define MCF_GPIO_PODR_BE_PODR_BE2 (0x04) -#define MCF_GPIO_PODR_BE_PODR_BE3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PODR_CS */ -#define MCF_GPIO_PODR_CS_PODR_CS1 (0x02) -#define MCF_GPIO_PODR_CS_PODR_CS2 (0x04) -#define MCF_GPIO_PODR_CS_PODR_CS3 (0x08) -#define MCF_GPIO_PODR_CS_PODR_CS4 (0x10) -#define MCF_GPIO_PODR_CS_PODR_CS5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PODR_PWM */ -#define MCF_GPIO_PODR_PWM_PODR_PWM2 (0x04) -#define MCF_GPIO_PODR_PWM_PODR_PWM3 (0x08) -#define MCF_GPIO_PODR_PWM_PODR_PWM4 (0x10) -#define MCF_GPIO_PODR_PWM_PODR_PWM5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PODR_FECI2C */ -#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C0 (0x01) -#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C1 (0x02) -#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C2 (0x04) -#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PODR_UART */ -#define MCF_GPIO_PODR_UART_PODR_UART0 (0x01) -#define MCF_GPIO_PODR_UART_PODR_UART1 (0x02) -#define MCF_GPIO_PODR_UART_PODR_UART2 (0x04) -#define MCF_GPIO_PODR_UART_PODR_UART3 (0x08) -#define MCF_GPIO_PODR_UART_PODR_UART4 (0x10) -#define MCF_GPIO_PODR_UART_PODR_UART5 (0x20) -#define MCF_GPIO_PODR_UART_PODR_UART6 (0x40) -#define MCF_GPIO_PODR_UART_PODR_UART7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PODR_QSPI */ -#define MCF_GPIO_PODR_QSPI_PODR_QSPI0 (0x01) -#define MCF_GPIO_PODR_QSPI_PODR_QSPI1 (0x02) -#define MCF_GPIO_PODR_QSPI_PODR_QSPI2 (0x04) -#define MCF_GPIO_PODR_QSPI_PODR_QSPI3 (0x08) -#define MCF_GPIO_PODR_QSPI_PODR_QSPI4 (0x10) -#define MCF_GPIO_PODR_QSPI_PODR_QSPI5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PODR_TIMER */ -#define MCF_GPIO_PODR_TIMER_PODR_TIMER0 (0x01) -#define MCF_GPIO_PODR_TIMER_PODR_TIMER1 (0x02) -#define MCF_GPIO_PODR_TIMER_PODR_TIMER2 (0x04) -#define MCF_GPIO_PODR_TIMER_PODR_TIMER3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAH */ -#define MCF_GPIO_PODR_LCDDATAH_PODR_LCDDATAH0 (0x01) -#define MCF_GPIO_PODR_LCDDATAH_PODR_LCDDATAH1 (0x02) - -/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAM */ -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM0 (0x01) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM1 (0x02) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM2 (0x04) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM3 (0x08) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM4 (0x10) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM5 (0x20) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM6 (0x40) -#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAL */ -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL0 (0x01) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL1 (0x02) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL2 (0x04) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL3 (0x08) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL4 (0x10) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL5 (0x20) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL6 (0x40) -#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PODR_LCDCTLH */ -#define MCF_GPIO_PODR_LCDCTLH_PODR_LCDCTLH0 (0x01) - -/* Bit definitions and macros for MCF_GPIO_PODR_LCDCTLL */ -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL0 (0x01) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL1 (0x02) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL2 (0x04) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL3 (0x08) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL4 (0x10) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL5 (0x20) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL6 (0x40) -#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_FECH */ -#define MCF_GPIO_PDDR_FECH_PDDR_FECH0 (0x01) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH1 (0x02) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH2 (0x04) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH3 (0x08) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH4 (0x10) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH5 (0x20) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH6 (0x40) -#define MCF_GPIO_PDDR_FECH_PDDR_FECH7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_FECL */ -#define MCF_GPIO_PDDR_FECL_PDDR_FECL0 (0x01) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL1 (0x02) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL2 (0x04) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL3 (0x08) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL4 (0x10) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL5 (0x20) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL6 (0x40) -#define MCF_GPIO_PDDR_FECL_PDDR_FECL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_SSI */ -#define MCF_GPIO_PDDR_SSI_PDDR_SSI0 (0x01) -#define MCF_GPIO_PDDR_SSI_PDDR_SSI1 (0x02) -#define MCF_GPIO_PDDR_SSI_PDDR_SSI2 (0x04) -#define MCF_GPIO_PDDR_SSI_PDDR_SSI3 (0x08) -#define MCF_GPIO_PDDR_SSI_PDDR_SSI4 (0x10) - -/* Bit definitions and macros for MCF_GPIO_PDDR_BUSCTL */ -#define MCF_GPIO_PDDR_BUSCTL_POSDR_BUSCTL0 (0x01) -#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL1 (0x02) -#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL2 (0x04) -#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PDDR_BE */ -#define MCF_GPIO_PDDR_BE_PDDR_BE0 (0x01) -#define MCF_GPIO_PDDR_BE_PDDR_BE1 (0x02) -#define MCF_GPIO_PDDR_BE_PDDR_BE2 (0x04) -#define MCF_GPIO_PDDR_BE_PDDR_BE3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PDDR_CS */ -#define MCF_GPIO_PDDR_CS_PDDR_CS1 (0x02) -#define MCF_GPIO_PDDR_CS_PDDR_CS2 (0x04) -#define MCF_GPIO_PDDR_CS_PDDR_CS3 (0x08) -#define MCF_GPIO_PDDR_CS_PDDR_CS4 (0x10) -#define MCF_GPIO_PDDR_CS_PDDR_CS5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PDDR_PWM */ -#define MCF_GPIO_PDDR_PWM_PDDR_PWM2 (0x04) -#define MCF_GPIO_PDDR_PWM_PDDR_PWM3 (0x08) -#define MCF_GPIO_PDDR_PWM_PDDR_PWM4 (0x10) -#define MCF_GPIO_PDDR_PWM_PDDR_PWM5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PDDR_FECI2C */ -#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C0 (0x01) -#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C1 (0x02) -#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C2 (0x04) -#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PDDR_UART */ -#define MCF_GPIO_PDDR_UART_PDDR_UART0 (0x01) -#define MCF_GPIO_PDDR_UART_PDDR_UART1 (0x02) -#define MCF_GPIO_PDDR_UART_PDDR_UART2 (0x04) -#define MCF_GPIO_PDDR_UART_PDDR_UART3 (0x08) -#define MCF_GPIO_PDDR_UART_PDDR_UART4 (0x10) -#define MCF_GPIO_PDDR_UART_PDDR_UART5 (0x20) -#define MCF_GPIO_PDDR_UART_PDDR_UART6 (0x40) -#define MCF_GPIO_PDDR_UART_PDDR_UART7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_QSPI */ -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI0 (0x01) -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI1 (0x02) -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI2 (0x04) -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI3 (0x08) -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI4 (0x10) -#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PDDR_TIMER */ -#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER0 (0x01) -#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER1 (0x02) -#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER2 (0x04) -#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAH */ -#define MCF_GPIO_PDDR_LCDDATAH_PDDR_LCDDATAH0 (0x01) -#define MCF_GPIO_PDDR_LCDDATAH_PDDR_LCDDATAH1 (0x02) - -/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAM */ -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM0 (0x01) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM1 (0x02) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM2 (0x04) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM3 (0x08) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM4 (0x10) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM5 (0x20) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM6 (0x40) -#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAL */ -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL0 (0x01) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL1 (0x02) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL2 (0x04) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL3 (0x08) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL4 (0x10) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL5 (0x20) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL6 (0x40) -#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PDDR_LCDCTLH */ -#define MCF_GPIO_PDDR_LCDCTLH_PDDR_LCDCTLH0 (0x01) - -/* Bit definitions and macros for MCF_GPIO_PDDR_LCDCTLL */ -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL0 (0x01) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL1 (0x02) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL2 (0x04) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL3 (0x08) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL4 (0x10) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL5 (0x20) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL6 (0x40) -#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECH */ -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH0 (0x01) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH1 (0x02) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH2 (0x04) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH3 (0x08) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH4 (0x10) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH5 (0x20) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH6 (0x40) -#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECL */ -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL0 (0x01) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL1 (0x02) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL2 (0x04) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL3 (0x08) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL4 (0x10) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL5 (0x20) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL6 (0x40) -#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_SSI */ -#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI0 (0x01) -#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI1 (0x02) -#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI2 (0x04) -#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI3 (0x08) -#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI4 (0x10) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_BUSCTL */ -#define MCF_GPIO_PPDSDR_BUSCTL_POSDR_BUSCTL0 (0x01) -#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL1 (0x02) -#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL2 (0x04) -#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_BE */ -#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE0 (0x01) -#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE1 (0x02) -#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE2 (0x04) -#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_CS */ -#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS1 (0x02) -#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS2 (0x04) -#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS3 (0x08) -#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS4 (0x10) -#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_PWM */ -#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM2 (0x04) -#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM3 (0x08) -#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM4 (0x10) -#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECI2C */ -#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C0 (0x01) -#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C1 (0x02) -#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C2 (0x04) -#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_UART */ -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART0 (0x01) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART1 (0x02) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART2 (0x04) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART3 (0x08) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART4 (0x10) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART5 (0x20) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART6 (0x40) -#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_QSPI */ -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI0 (0x01) -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI1 (0x02) -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI2 (0x04) -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI3 (0x08) -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI4 (0x10) -#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_TIMER */ -#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER0 (0x01) -#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER1 (0x02) -#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER2 (0x04) -#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAH */ -#define MCF_GPIO_PPDSDR_LCDDATAH_PPDSDR_LCDDATAH0 (0x01) -#define MCF_GPIO_PPDSDR_LCDDATAH_PPDSDR_LCDDATAH1 (0x02) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAM */ -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM0 (0x01) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM1 (0x02) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM2 (0x04) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM3 (0x08) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM4 (0x10) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM5 (0x20) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM6 (0x40) -#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAL */ -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL0 (0x01) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL1 (0x02) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL2 (0x04) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL3 (0x08) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL4 (0x10) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL5 (0x20) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL6 (0x40) -#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDCTLH */ -#define MCF_GPIO_PPDSDR_LCDCTLH_PPDSDR_LCDCTLH0 (0x01) - -/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDCTLL */ -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL0 (0x01) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL1 (0x02) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL2 (0x04) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL3 (0x08) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL4 (0x10) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL5 (0x20) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL6 (0x40) -#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_FECH */ -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH0 (0x01) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH1 (0x02) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH2 (0x04) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH3 (0x08) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH4 (0x10) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH5 (0x20) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH6 (0x40) -#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_FECL */ -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL0 (0x01) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL1 (0x02) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL2 (0x04) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL3 (0x08) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL4 (0x10) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL5 (0x20) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL6 (0x40) -#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_SSI */ -#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI0 (0x01) -#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI1 (0x02) -#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI2 (0x04) -#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI3 (0x08) -#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI4 (0x10) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_BUSCTL */ -#define MCF_GPIO_PCLRR_BUSCTL_POSDR_BUSCTL0 (0x01) -#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL1 (0x02) -#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL2 (0x04) -#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_BE */ -#define MCF_GPIO_PCLRR_BE_PCLRR_BE0 (0x01) -#define MCF_GPIO_PCLRR_BE_PCLRR_BE1 (0x02) -#define MCF_GPIO_PCLRR_BE_PCLRR_BE2 (0x04) -#define MCF_GPIO_PCLRR_BE_PCLRR_BE3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_CS */ -#define MCF_GPIO_PCLRR_CS_PCLRR_CS1 (0x02) -#define MCF_GPIO_PCLRR_CS_PCLRR_CS2 (0x04) -#define MCF_GPIO_PCLRR_CS_PCLRR_CS3 (0x08) -#define MCF_GPIO_PCLRR_CS_PCLRR_CS4 (0x10) -#define MCF_GPIO_PCLRR_CS_PCLRR_CS5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_PWM */ -#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM2 (0x04) -#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM3 (0x08) -#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM4 (0x10) -#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_FECI2C */ -#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C0 (0x01) -#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C1 (0x02) -#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C2 (0x04) -#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_UART */ -#define MCF_GPIO_PCLRR_UART_PCLRR_UART0 (0x01) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART1 (0x02) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART2 (0x04) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART3 (0x08) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART4 (0x10) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART5 (0x20) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART6 (0x40) -#define MCF_GPIO_PCLRR_UART_PCLRR_UART7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_QSPI */ -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI0 (0x01) -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI1 (0x02) -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI2 (0x04) -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI3 (0x08) -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI4 (0x10) -#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI5 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_TIMER */ -#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER0 (0x01) -#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER1 (0x02) -#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER2 (0x04) -#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAH */ -#define MCF_GPIO_PCLRR_LCDDATAH_PCLRR_LCDDATAH0 (0x01) -#define MCF_GPIO_PCLRR_LCDDATAH_PCLRR_LCDDATAH1 (0x02) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAM */ -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM0 (0x01) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM1 (0x02) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM2 (0x04) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM3 (0x08) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM4 (0x10) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM5 (0x20) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM6 (0x40) -#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAL */ -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL0 (0x01) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL1 (0x02) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL2 (0x04) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL3 (0x08) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL4 (0x10) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL5 (0x20) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL6 (0x40) -#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDCTLH */ -#define MCF_GPIO_PCLRR_LCDCTLH_PCLRR_LCDCTLH0 (0x01) - -/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDCTLL */ -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL0 (0x01) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL1 (0x02) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL2 (0x04) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL3 (0x08) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL4 (0x10) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL5 (0x20) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL6 (0x40) -#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL7 (0x80) - -/* Bit definitions and macros for MCF_GPIO_PAR_FEC */ -#define MCF_GPIO_PAR_FEC_PAR_FEC_MII(x) (((x)&0x03)<<0) -#define MCF_GPIO_PAR_FEC_PAR_FEC_7W(x) (((x)&0x03)<<2) -#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_GPIO (0x00) -#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_URTS1 (0x04) -#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_FEC (0x0C) -#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_GPIO (0x00) -#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_UART (0x01) -#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_FEC (0x03) - -/* Bit definitions and macros for MCF_GPIO_PAR_PWM */ -#define MCF_GPIO_PAR_PWM_PAR_PWM1(x) (((x)&0x03)<<0) -#define MCF_GPIO_PAR_PWM_PAR_PWM3(x) (((x)&0x03)<<2) -#define MCF_GPIO_PAR_PWM_PAR_PWM5 (0x10) -#define MCF_GPIO_PAR_PWM_PAR_PWM7 (0x20) - -/* Bit definitions and macros for MCF_GPIO_PAR_BUSCTL */ -#define MCF_GPIO_PAR_BUSCTL_PAR_TS(x) (((x)&0x03)<<3) -#define MCF_GPIO_PAR_BUSCTL_PAR_RWB (0x20) -#define MCF_GPIO_PAR_BUSCTL_PAR_TA (0x40) -#define MCF_GPIO_PAR_BUSCTL_PAR_OE (0x80) -#define MCF_GPIO_PAR_BUSCTL_PAR_OE_GPIO (0x00) -#define MCF_GPIO_PAR_BUSCTL_PAR_OE_OE (0x80) -#define MCF_GPIO_PAR_BUSCTL_PAR_TA_GPIO (0x00) -#define MCF_GPIO_PAR_BUSCTL_PAR_TA_TA (0x40) -#define MCF_GPIO_PAR_BUSCTL_PAR_RWB_GPIO (0x00) -#define MCF_GPIO_PAR_BUSCTL_PAR_RWB_RWB (0x20) -#define MCF_GPIO_PAR_BUSCTL_PAR_TS_GPIO (0x00) -#define MCF_GPIO_PAR_BUSCTL_PAR_TS_DACK0 (0x10) -#define MCF_GPIO_PAR_BUSCTL_PAR_TS_TS (0x18) - -/* Bit definitions and macros for MCF_GPIO_PAR_FECI2C */ -#define MCF_GPIO_PAR_FECI2C_PAR_SDA(x) (((x)&0x03)<<0) -#define MCF_GPIO_PAR_FECI2C_PAR_SCL(x) (((x)&0x03)<<2) -#define MCF_GPIO_PAR_FECI2C_PAR_MDIO(x) (((x)&0x03)<<4) -#define MCF_GPIO_PAR_FECI2C_PAR_MDC(x) (((x)&0x03)<<6) -#define MCF_GPIO_PAR_FECI2C_PAR_MDC_GPIO (0x00) -#define MCF_GPIO_PAR_FECI2C_PAR_MDC_UTXD2 (0x40) -#define MCF_GPIO_PAR_FECI2C_PAR_MDC_SCL (0x80) -#define MCF_GPIO_PAR_FECI2C_PAR_MDC_EMDC (0xC0) -#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_GPIO (0x00) -#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_URXD2 (0x10) -#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_SDA (0x20) -#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_EMDIO (0x30) -#define MCF_GPIO_PAR_FECI2C_PAR_SCL_GPIO (0x00) -#define MCF_GPIO_PAR_FECI2C_PAR_SCL_UTXD2 (0x04) -#define MCF_GPIO_PAR_FECI2C_PAR_SCL_SCL (0x0C) -#define MCF_GPIO_PAR_FECI2C_PAR_SDA_GPIO (0x00) -#define MCF_GPIO_PAR_FECI2C_PAR_SDA_URXD2 (0x02) -#define MCF_GPIO_PAR_FECI2C_PAR_SDA_SDA (0x03) - -/* Bit definitions and macros for MCF_GPIO_PAR_BE */ -#define MCF_GPIO_PAR_BE_PAR_BE0 (0x01) -#define MCF_GPIO_PAR_BE_PAR_BE1 (0x02) -#define MCF_GPIO_PAR_BE_PAR_BE2 (0x04) -#define MCF_GPIO_PAR_BE_PAR_BE3 (0x08) - -/* Bit definitions and macros for MCF_GPIO_PAR_CS */ -#define MCF_GPIO_PAR_CS_PAR_CS1 (0x02) -#define MCF_GPIO_PAR_CS_PAR_CS2 (0x04) -#define MCF_GPIO_PAR_CS_PAR_CS3 (0x08) -#define MCF_GPIO_PAR_CS_PAR_CS4 (0x10) -#define MCF_GPIO_PAR_CS_PAR_CS5 (0x20) -#define MCF_GPIO_PAR_CS_PAR_CS_CS1_GPIO (0x00) -#define MCF_GPIO_PAR_CS_PAR_CS_CS1_SDCS1 (0x01) -#define MCF_GPIO_PAR_CS_PAR_CS_CS1_CS1 (0x03) - -/* Bit definitions and macros for MCF_GPIO_PAR_SSI */ -#define MCF_GPIO_PAR_SSI_PAR_MCLK (0x0080) -#define MCF_GPIO_PAR_SSI_PAR_TXD(x) (((x)&0x0003)<<8) -#define MCF_GPIO_PAR_SSI_PAR_RXD(x) (((x)&0x0003)<<10) -#define MCF_GPIO_PAR_SSI_PAR_FS(x) (((x)&0x0003)<<12) -#define MCF_GPIO_PAR_SSI_PAR_BCLK(x) (((x)&0x0003)<<14) - -/* Bit definitions and macros for MCF_GPIO_PAR_UART */ -#define MCF_GPIO_PAR_UART_PAR_UTXD0 (0x0001) -#define MCF_GPIO_PAR_UART_PAR_URXD0 (0x0002) -#define MCF_GPIO_PAR_UART_PAR_URTS0 (0x0004) -#define MCF_GPIO_PAR_UART_PAR_UCTS0 (0x0008) -#define MCF_GPIO_PAR_UART_PAR_UTXD1(x) (((x)&0x0003)<<4) -#define MCF_GPIO_PAR_UART_PAR_URXD1(x) (((x)&0x0003)<<6) -#define MCF_GPIO_PAR_UART_PAR_URTS1(x) (((x)&0x0003)<<8) -#define MCF_GPIO_PAR_UART_PAR_UCTS1(x) (((x)&0x0003)<<10) -#define MCF_GPIO_PAR_UART_PAR_UCTS1_GPIO (0x0000) -#define MCF_GPIO_PAR_UART_PAR_UCTS1_SSI_BCLK (0x0800) -#define MCF_GPIO_PAR_UART_PAR_UCTS1_ULPI_D7 (0x0400) -#define MCF_GPIO_PAR_UART_PAR_UCTS1_UCTS1 (0x0C00) -#define MCF_GPIO_PAR_UART_PAR_URTS1_GPIO (0x0000) -#define MCF_GPIO_PAR_UART_PAR_URTS1_SSI_FS (0x0200) -#define MCF_GPIO_PAR_UART_PAR_URTS1_ULPI_D6 (0x0100) -#define MCF_GPIO_PAR_UART_PAR_URTS1_URTS1 (0x0300) -#define MCF_GPIO_PAR_UART_PAR_URXD1_GPIO (0x0000) -#define MCF_GPIO_PAR_UART_PAR_URXD1_SSI_RXD (0x0080) -#define MCF_GPIO_PAR_UART_PAR_URXD1_ULPI_D5 (0x0040) -#define MCF_GPIO_PAR_UART_PAR_URXD1_URXD1 (0x00C0) -#define MCF_GPIO_PAR_UART_PAR_UTXD1_GPIO (0x0000) -#define MCF_GPIO_PAR_UART_PAR_UTXD1_SSI_TXD (0x0020) -#define MCF_GPIO_PAR_UART_PAR_UTXD1_ULPI_D4 (0x0010) -#define MCF_GPIO_PAR_UART_PAR_UTXD1_UTXD1 (0x0030) - -/* Bit definitions and macros for MCF_GPIO_PAR_QSPI */ -#define MCF_GPIO_PAR_QSPI_PAR_SCK(x) (((x)&0x0003)<<4) -#define MCF_GPIO_PAR_QSPI_PAR_DOUT(x) (((x)&0x0003)<<6) -#define MCF_GPIO_PAR_QSPI_PAR_DIN(x) (((x)&0x0003)<<8) -#define MCF_GPIO_PAR_QSPI_PAR_PCS0(x) (((x)&0x0003)<<10) -#define MCF_GPIO_PAR_QSPI_PAR_PCS1(x) (((x)&0x0003)<<12) -#define MCF_GPIO_PAR_QSPI_PAR_PCS2(x) (((x)&0x0003)<<14) - -/* Bit definitions and macros for MCF_GPIO_PAR_TIMER */ -#define MCF_GPIO_PAR_TIMER_PAR_TIN0(x) (((x)&0x03)<<0) -#define MCF_GPIO_PAR_TIMER_PAR_TIN1(x) (((x)&0x03)<<2) -#define MCF_GPIO_PAR_TIMER_PAR_TIN2(x) (((x)&0x03)<<4) -#define MCF_GPIO_PAR_TIMER_PAR_TIN3(x) (((x)&0x03)<<6) -#define MCF_GPIO_PAR_TIMER_PAR_TIN3_GPIO (0x00) -#define MCF_GPIO_PAR_TIMER_PAR_TIN3_TOUT3 (0x80) -#define MCF_GPIO_PAR_TIMER_PAR_TIN3_URXD2 (0x40) -#define MCF_GPIO_PAR_TIMER_PAR_TIN3_TIN3 (0xC0) -#define MCF_GPIO_PAR_TIMER_PAR_TIN2_GPIO (0x00) -#define MCF_GPIO_PAR_TIMER_PAR_TIN2_TOUT2 (0x20) -#define MCF_GPIO_PAR_TIMER_PAR_TIN2_UTXD2 (0x10) -#define MCF_GPIO_PAR_TIMER_PAR_TIN2_TIN2 (0x30) -#define MCF_GPIO_PAR_TIMER_PAR_TIN1_GPIO (0x00) -#define MCF_GPIO_PAR_TIMER_PAR_TIN1_TOUT1 (0x08) -#define MCF_GPIO_PAR_TIMER_PAR_TIN1_DACK1 (0x04) -#define MCF_GPIO_PAR_TIMER_PAR_TIN1_TIN1 (0x0C) -#define MCF_GPIO_PAR_TIMER_PAR_TIN0_GPIO (0x00) -#define MCF_GPIO_PAR_TIMER_PAR_TIN0_TOUT0 (0x02) -#define MCF_GPIO_PAR_TIMER_PAR_TIN0_DREQ0 (0x01) -#define MCF_GPIO_PAR_TIMER_PAR_TIN0_TIN0 (0x03) - -/* Bit definitions and macros for MCF_GPIO_PAR_LCDDATA */ -#define MCF_GPIO_PAR_LCDDATA_PAR_LD7_0(x) (((x)&0x03)<<0) -#define MCF_GPIO_PAR_LCDDATA_PAR_LD15_8(x) (((x)&0x03)<<2) -#define MCF_GPIO_PAR_LCDDATA_PAR_LD16(x) (((x)&0x03)<<4) -#define MCF_GPIO_PAR_LCDDATA_PAR_LD17(x) (((x)&0x03)<<6) - -/* Bit definitions and macros for MCF_GPIO_PAR_LCDCTL */ -#define MCF_GPIO_PAR_LCDCTL_PAR_CLS (0x0001) -#define MCF_GPIO_PAR_LCDCTL_PAR_PS (0x0002) -#define MCF_GPIO_PAR_LCDCTL_PAR_REV (0x0004) -#define MCF_GPIO_PAR_LCDCTL_PAR_SPL_SPR (0x0008) -#define MCF_GPIO_PAR_LCDCTL_PAR_CONTRAST (0x0010) -#define MCF_GPIO_PAR_LCDCTL_PAR_LSCLK (0x0020) -#define MCF_GPIO_PAR_LCDCTL_PAR_LP_HSYNC (0x0040) -#define MCF_GPIO_PAR_LCDCTL_PAR_FLM_VSYNC (0x0080) -#define MCF_GPIO_PAR_LCDCTL_PAR_ACD_OE (0x0100) - -/* Bit definitions and macros for MCF_GPIO_PAR_IRQ */ -#define MCF_GPIO_PAR_IRQ_PAR_IRQ1(x) (((x)&0x0003)<<4) -#define MCF_GPIO_PAR_IRQ_PAR_IRQ2(x) (((x)&0x0003)<<6) -#define MCF_GPIO_PAR_IRQ_PAR_IRQ4(x) (((x)&0x0003)<<8) -#define MCF_GPIO_PAR_IRQ_PAR_IRQ5(x) (((x)&0x0003)<<10) -#define MCF_GPIO_PAR_IRQ_PAR_IRQ6(x) (((x)&0x0003)<<12) - -/* Bit definitions and macros for MCF_GPIO_MSCR_FLEXBUS */ -#define MCF_GPIO_MSCR_FLEXBUS_MSCR_ADDRCTL(x) (((x)&0x03)<<0) -#define MCF_GPIO_MSCR_FLEXBUS_MSCR_DLOWER(x) (((x)&0x03)<<2) -#define MCF_GPIO_MSCR_FLEXBUS_MSCR_DUPPER(x) (((x)&0x03)<<4) - -/* Bit definitions and macros for MCF_GPIO_MSCR_SDRAM */ -#define MCF_GPIO_MSCR_SDRAM_MSCR_SDRAM(x) (((x)&0x03)<<0) -#define MCF_GPIO_MSCR_SDRAM_MSCR_SDCLK(x) (((x)&0x03)<<2) -#define MCF_GPIO_MSCR_SDRAM_MSCR_SDCLKB(x) (((x)&0x03)<<4) - -/* Bit definitions and macros for MCF_GPIO_DSCR_I2C */ -#define MCF_GPIO_DSCR_I2C_I2C_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_PWM */ -#define MCF_GPIO_DSCR_PWM_PWM_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_FEC */ -#define MCF_GPIO_DSCR_FEC_FEC_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_UART */ -#define MCF_GPIO_DSCR_UART_UART0_DSE(x) (((x)&0x03)<<0) -#define MCF_GPIO_DSCR_UART_UART1_DSE(x) (((x)&0x03)<<2) - -/* Bit definitions and macros for MCF_GPIO_DSCR_QSPI */ -#define MCF_GPIO_DSCR_QSPI_QSPI_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_TIMER */ -#define MCF_GPIO_DSCR_TIMER_TIMER_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_SSI */ -#define MCF_GPIO_DSCR_SSI_SSI_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_LCD */ -#define MCF_GPIO_DSCR_LCD_LCD_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_DEBUG */ -#define MCF_GPIO_DSCR_DEBUG_DEBUG_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_CLKRST */ -#define MCF_GPIO_DSCR_CLKRST_CLKRST_DSE(x) (((x)&0x03)<<0) - -/* Bit definitions and macros for MCF_GPIO_DSCR_IRQ */ -#define MCF_GPIO_DSCR_IRQ_IRQ_DSE(x) (((x)&0x03)<<0) - -/* - * Generic GPIO support - */ -#define MCFGPIO_PODR MCFGPIO_PODR_FECH -#define MCFGPIO_PDDR MCFGPIO_PDDR_FECH -#define MCFGPIO_PPDR MCFGPIO_PPDSDR_FECH -#define MCFGPIO_SETR MCFGPIO_PPDSDR_FECH -#define MCFGPIO_CLRR MCFGPIO_PCLRR_FECH - -#define MCFGPIO_PIN_MAX 136 -#define MCFGPIO_IRQ_MAX 8 -#define MCFGPIO_IRQ_VECBASE MCFINT_VECBASE - -/********************************************************************* - * - * Phase Locked Loop (PLL) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCF_PLL_PODR 0xFC0C0000 -#define MCF_PLL_PLLCR 0xFC0C0004 -#define MCF_PLL_PMDR 0xFC0C0008 -#define MCF_PLL_PFDR 0xFC0C000C - -/* Bit definitions and macros for MCF_PLL_PODR */ -#define MCF_PLL_PODR_BUSDIV(x) (((x)&0x0F)<<0) -#define MCF_PLL_PODR_CPUDIV(x) (((x)&0x0F)<<4) - -/* Bit definitions and macros for MCF_PLL_PLLCR */ -#define MCF_PLL_PLLCR_DITHDEV(x) (((x)&0x07)<<0) -#define MCF_PLL_PLLCR_DITHEN (0x80) - -/* Bit definitions and macros for MCF_PLL_PMDR */ -#define MCF_PLL_PMDR_MODDIV(x) (((x)&0xFF)<<0) - -/* Bit definitions and macros for MCF_PLL_PFDR */ -#define MCF_PLL_PFDR_MFD(x) (((x)&0xFF)<<0) - -/********************************************************************* - * - * System Control Module Registers (SCM) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCF_SCM_MPR 0xFC000000 -#define MCF_SCM_PACRA 0xFC000020 -#define MCF_SCM_PACRB 0xFC000024 -#define MCF_SCM_PACRC 0xFC000028 -#define MCF_SCM_PACRD 0xFC00002C -#define MCF_SCM_PACRE 0xFC000040 -#define MCF_SCM_PACRF 0xFC000044 - -#define MCF_SCM_BCR 0xFC040024 - -/********************************************************************* - * - * SDRAM Controller (SDRAMC) - * - *********************************************************************/ - -/* Register read/write macros */ -#define MCF_SDRAMC_SDMR 0xFC0B8000 -#define MCF_SDRAMC_SDCR 0xFC0B8004 -#define MCF_SDRAMC_SDCFG1 0xFC0B8008 -#define MCF_SDRAMC_SDCFG2 0xFC0B800C -#define MCF_SDRAMC_LIMP_FIX 0xFC0B8080 -#define MCF_SDRAMC_SDDS 0xFC0B8100 -#define MCF_SDRAMC_SDCS0 0xFC0B8110 -#define MCF_SDRAMC_SDCS1 0xFC0B8114 -#define MCF_SDRAMC_SDCS2 0xFC0B8118 -#define MCF_SDRAMC_SDCS3 0xFC0B811C - -/* Bit definitions and macros for MCF_SDRAMC_SDMR */ -#define MCF_SDRAMC_SDMR_CMD (0x00010000) -#define MCF_SDRAMC_SDMR_AD(x) (((x)&0x00000FFF)<<18) -#define MCF_SDRAMC_SDMR_BNKAD(x) (((x)&0x00000003)<<30) -#define MCF_SDRAMC_SDMR_BNKAD_LMR (0x00000000) -#define MCF_SDRAMC_SDMR_BNKAD_LEMR (0x40000000) - -/* Bit definitions and macros for MCF_SDRAMC_SDCR */ -#define MCF_SDRAMC_SDCR_IPALL (0x00000002) -#define MCF_SDRAMC_SDCR_IREF (0x00000004) -#define MCF_SDRAMC_SDCR_DQS_OE(x) (((x)&0x0000000F)<<8) -#define MCF_SDRAMC_SDCR_PS(x) (((x)&0x00000003)<<12) -#define MCF_SDRAMC_SDCR_RCNT(x) (((x)&0x0000003F)<<16) -#define MCF_SDRAMC_SDCR_OE_RULE (0x00400000) -#define MCF_SDRAMC_SDCR_MUX(x) (((x)&0x00000003)<<24) -#define MCF_SDRAMC_SDCR_REF (0x10000000) -#define MCF_SDRAMC_SDCR_DDR (0x20000000) -#define MCF_SDRAMC_SDCR_CKE (0x40000000) -#define MCF_SDRAMC_SDCR_MODE_EN (0x80000000) -#define MCF_SDRAMC_SDCR_PS_16 (0x00002000) -#define MCF_SDRAMC_SDCR_PS_32 (0x00000000) - -/* Bit definitions and macros for MCF_SDRAMC_SDCFG1 */ -#define MCF_SDRAMC_SDCFG1_WTLAT(x) (((x)&0x00000007)<<4) -#define MCF_SDRAMC_SDCFG1_REF2ACT(x) (((x)&0x0000000F)<<8) -#define MCF_SDRAMC_SDCFG1_PRE2ACT(x) (((x)&0x00000007)<<12) -#define MCF_SDRAMC_SDCFG1_ACT2RW(x) (((x)&0x00000007)<<16) -#define MCF_SDRAMC_SDCFG1_RDLAT(x) (((x)&0x0000000F)<<20) -#define MCF_SDRAMC_SDCFG1_SWT2RD(x) (((x)&0x00000007)<<24) -#define MCF_SDRAMC_SDCFG1_SRD2RW(x) (((x)&0x0000000F)<<28) - -/* Bit definitions and macros for MCF_SDRAMC_SDCFG2 */ -#define MCF_SDRAMC_SDCFG2_BL(x) (((x)&0x0000000F)<<16) -#define MCF_SDRAMC_SDCFG2_BRD2WT(x) (((x)&0x0000000F)<<20) -#define MCF_SDRAMC_SDCFG2_BWT2RW(x) (((x)&0x0000000F)<<24) -#define MCF_SDRAMC_SDCFG2_BRD2PRE(x) (((x)&0x0000000F)<<28) - -/* Device Errata - LIMP mode work around */ -#define MCF_SDRAMC_REFRESH (0x40000000) - -/* Bit definitions and macros for MCF_SDRAMC_SDDS */ -#define MCF_SDRAMC_SDDS_SB_D(x) (((x)&0x00000003)<<0) -#define MCF_SDRAMC_SDDS_SB_S(x) (((x)&0x00000003)<<2) -#define MCF_SDRAMC_SDDS_SB_A(x) (((x)&0x00000003)<<4) -#define MCF_SDRAMC_SDDS_SB_C(x) (((x)&0x00000003)<<6) -#define MCF_SDRAMC_SDDS_SB_E(x) (((x)&0x00000003)<<8) - -/* Bit definitions and macros for MCF_SDRAMC_SDCS */ -#define MCF_SDRAMC_SDCS_CSSZ(x) (((x)&0x0000001F)<<0) -#define MCF_SDRAMC_SDCS_BASE(x) (((x)&0x00000FFF)<<20) -#define MCF_SDRAMC_SDCS_BA(x) ((x)&0xFFF00000) -#define MCF_SDRAMC_SDCS_CSSZ_DIABLE (0x00000000) -#define MCF_SDRAMC_SDCS_CSSZ_1MBYTE (0x00000013) -#define MCF_SDRAMC_SDCS_CSSZ_2MBYTE (0x00000014) -#define MCF_SDRAMC_SDCS_CSSZ_4MBYTE (0x00000015) -#define MCF_SDRAMC_SDCS_CSSZ_8MBYTE (0x00000016) -#define MCF_SDRAMC_SDCS_CSSZ_16MBYTE (0x00000017) -#define MCF_SDRAMC_SDCS_CSSZ_32MBYTE (0x00000018) -#define MCF_SDRAMC_SDCS_CSSZ_64MBYTE (0x00000019) -#define MCF_SDRAMC_SDCS_CSSZ_128MBYTE (0x0000001A) -#define MCF_SDRAMC_SDCS_CSSZ_256MBYTE (0x0000001B) -#define MCF_SDRAMC_SDCS_CSSZ_512MBYTE (0x0000001C) -#define MCF_SDRAMC_SDCS_CSSZ_1GBYTE (0x0000001D) -#define MCF_SDRAMC_SDCS_CSSZ_2GBYTE (0x0000001E) -#define MCF_SDRAMC_SDCS_CSSZ_4GBYTE (0x0000001F) - -/* - * Edge Port Module (EPORT) - */ -#define MCFEPORT_EPPAR (0xFC094000) -#define MCFEPORT_EPDDR (0xFC094002) -#define MCFEPORT_EPIER (0xFC094003) -#define MCFEPORT_EPDR (0xFC094004) -#define MCFEPORT_EPPDR (0xFC094005) -#define MCFEPORT_EPFR (0xFC094006) - -/********************************************************************/ -#endif /* m532xsim_h */ diff --git a/arch/m68k/include/asm/m53xxacr.h b/arch/m68k/include/asm/m53xxacr.h index cd952b0a8bd3..3177ce8331d6 100644 --- a/arch/m68k/include/asm/m53xxacr.h +++ b/arch/m68k/include/asm/m53xxacr.h @@ -55,8 +55,8 @@ #define CACHE_SIZE 0x2000 /* 8k of unified cache */ #define ICACHE_SIZE CACHE_SIZE #define DCACHE_SIZE CACHE_SIZE -#elif defined(CONFIG_M532x) -#define CACHE_SIZE 0x4000 /* 32k of unified cache */ +#elif defined(CONFIG_M53xx) +#define CACHE_SIZE 0x4000 /* 16k of unified cache */ #define ICACHE_SIZE CACHE_SIZE #define DCACHE_SIZE CACHE_SIZE #endif diff --git a/arch/m68k/include/asm/m53xxsim.h b/arch/m68k/include/asm/m53xxsim.h new file mode 100644 index 000000000000..faa1a2133bfd --- /dev/null +++ b/arch/m68k/include/asm/m53xxsim.h @@ -0,0 +1,1241 @@ +/****************************************************************************/ + +/* + * m53xxsim.h -- ColdFire 5329 registers + */ + +/****************************************************************************/ +#ifndef m53xxsim_h +#define m53xxsim_h +/****************************************************************************/ + +#define CPU_NAME "COLDFIRE(m53xx)" +#define CPU_INSTR_PER_JIFFY 3 +#define MCF_BUSCLK (MCF_CLK / 3) + +#include + +#define MCFINT_VECBASE 64 +#define MCFINT_UART0 26 /* Interrupt number for UART0 */ +#define MCFINT_UART1 27 /* Interrupt number for UART1 */ +#define MCFINT_UART2 28 /* Interrupt number for UART2 */ +#define MCFINT_QSPI 31 /* Interrupt number for QSPI */ +#define MCFINT_FECRX0 36 /* Interrupt number for FEC */ +#define MCFINT_FECTX0 40 /* Interrupt number for FEC */ +#define MCFINT_FECENTC0 42 /* Interrupt number for FEC */ + +#define MCF_IRQ_UART0 (MCFINT_VECBASE + MCFINT_UART0) +#define MCF_IRQ_UART1 (MCFINT_VECBASE + MCFINT_UART1) +#define MCF_IRQ_UART2 (MCFINT_VECBASE + MCFINT_UART2) + +#define MCF_IRQ_FECRX0 (MCFINT_VECBASE + MCFINT_FECRX0) +#define MCF_IRQ_FECTX0 (MCFINT_VECBASE + MCFINT_FECTX0) +#define MCF_IRQ_FECENTC0 (MCFINT_VECBASE + MCFINT_FECENTC0) + +#define MCF_IRQ_QSPI (MCFINT_VECBASE + MCFINT_QSPI) + +#define MCF_WTM_WCR 0xFC098000 + +/* + * Define the 532x SIM register set addresses. + */ +#define MCFSIM_IPRL 0xFC048004 +#define MCFSIM_IPRH 0xFC048000 +#define MCFSIM_IPR MCFSIM_IPRL +#define MCFSIM_IMRL 0xFC04800C +#define MCFSIM_IMRH 0xFC048008 +#define MCFSIM_IMR MCFSIM_IMRL +#define MCFSIM_ICR0 0xFC048040 +#define MCFSIM_ICR1 0xFC048041 +#define MCFSIM_ICR2 0xFC048042 +#define MCFSIM_ICR3 0xFC048043 +#define MCFSIM_ICR4 0xFC048044 +#define MCFSIM_ICR5 0xFC048045 +#define MCFSIM_ICR6 0xFC048046 +#define MCFSIM_ICR7 0xFC048047 +#define MCFSIM_ICR8 0xFC048048 +#define MCFSIM_ICR9 0xFC048049 +#define MCFSIM_ICR10 0xFC04804A +#define MCFSIM_ICR11 0xFC04804B + +/* + * Some symbol defines for the above... + */ +#define MCFSIM_SWDICR MCFSIM_ICR0 /* Watchdog timer ICR */ +#define MCFSIM_TIMER1ICR MCFSIM_ICR1 /* Timer 1 ICR */ +#define MCFSIM_TIMER2ICR MCFSIM_ICR2 /* Timer 2 ICR */ +#define MCFSIM_UART1ICR MCFSIM_ICR4 /* UART 1 ICR */ +#define MCFSIM_UART2ICR MCFSIM_ICR5 /* UART 2 ICR */ +#define MCFSIM_DMA0ICR MCFSIM_ICR6 /* DMA 0 ICR */ +#define MCFSIM_DMA1ICR MCFSIM_ICR7 /* DMA 1 ICR */ +#define MCFSIM_DMA2ICR MCFSIM_ICR8 /* DMA 2 ICR */ +#define MCFSIM_DMA3ICR MCFSIM_ICR9 /* DMA 3 ICR */ + + +#define MCFINTC0_SIMR 0xFC04801C +#define MCFINTC0_CIMR 0xFC04801D +#define MCFINTC0_ICR0 0xFC048040 +#define MCFINTC1_SIMR 0xFC04C01C +#define MCFINTC1_CIMR 0xFC04C01D +#define MCFINTC1_ICR0 0xFC04C040 +#define MCFINTC2_SIMR (0) +#define MCFINTC2_CIMR (0) +#define MCFINTC2_ICR0 (0) + +#define MCFSIM_ICR_TIMER1 (0xFC048040+32) +#define MCFSIM_ICR_TIMER2 (0xFC048040+33) + +/* + * Define system peripheral IRQ usage. + */ +#define MCF_IRQ_TIMER (64 + 32) /* Timer0 */ +#define MCF_IRQ_PROFILER (64 + 33) /* Timer1 */ + +/* + * UART module. + */ +#define MCFUART_BASE0 0xFC060000 /* Base address of UART1 */ +#define MCFUART_BASE1 0xFC064000 /* Base address of UART2 */ +#define MCFUART_BASE2 0xFC068000 /* Base address of UART3 */ + +/* + * FEC module. + */ +#define MCFFEC_BASE0 0xFC030000 /* Base address of FEC0 */ +#define MCFFEC_SIZE0 0x800 /* Size of FEC0 region */ + +/* + * QSPI module. + */ +#define MCFQSPI_BASE 0xFC05C000 /* Base address of QSPI */ +#define MCFQSPI_SIZE 0x40 /* Size of QSPI region */ + +#define MCFQSPI_CS0 84 +#define MCFQSPI_CS1 85 +#define MCFQSPI_CS2 86 + +/* + * Timer module. + */ +#define MCFTIMER_BASE1 0xFC070000 /* Base address of TIMER1 */ +#define MCFTIMER_BASE2 0xFC074000 /* Base address of TIMER2 */ +#define MCFTIMER_BASE3 0xFC078000 /* Base address of TIMER3 */ +#define MCFTIMER_BASE4 0xFC07C000 /* Base address of TIMER4 */ + +/********************************************************************* + * + * Reset Controller Module + * + *********************************************************************/ + +#define MCF_RCR 0xFC0A0000 +#define MCF_RSR 0xFC0A0001 + +#define MCF_RCR_SWRESET 0x80 /* Software reset bit */ +#define MCF_RCR_FRCSTOUT 0x40 /* Force external reset */ + + +/* + * Power Management + */ +#define MCFPM_WCR 0xfc040013 +#define MCFPM_PPMSR0 0xfc04002c +#define MCFPM_PPMCR0 0xfc04002d +#define MCFPM_PPMSR1 0xfc04002e +#define MCFPM_PPMCR1 0xfc04002f +#define MCFPM_PPMHR0 0xfc040030 +#define MCFPM_PPMLR0 0xfc040034 +#define MCFPM_PPMHR1 0xfc040038 +#define MCFPM_LPCR 0xec090007 + +/* + * The M5329EVB board needs a help getting its devices initialized + * at kernel start time if dBUG doesn't set it up (for example + * it is not used), so we need to do it manually. + */ +#ifdef __ASSEMBLER__ +.macro m5329EVB_setup + movel #0xFC098000, %a7 + movel #0x0, (%a7) +#define CORE_SRAM 0x80000000 +#define CORE_SRAM_SIZE 0x8000 + movel #CORE_SRAM, %d0 + addl #0x221, %d0 + movec %d0,%RAMBAR1 + movel #CORE_SRAM, %sp + addl #CORE_SRAM_SIZE, %sp + jsr sysinit +.endm +#define PLATFORM_SETUP m5329EVB_setup + +#endif /* __ASSEMBLER__ */ + +/********************************************************************* + * + * Chip Configuration Module (CCM) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCF_CCM_CCR 0xFC0A0004 +#define MCF_CCM_RCON 0xFC0A0008 +#define MCF_CCM_CIR 0xFC0A000A +#define MCF_CCM_MISCCR 0xFC0A0010 +#define MCF_CCM_CDR 0xFC0A0012 +#define MCF_CCM_UHCSR 0xFC0A0014 +#define MCF_CCM_UOCSR 0xFC0A0016 + +/* Bit definitions and macros for MCF_CCM_CCR */ +#define MCF_CCM_CCR_RESERVED (0x0001) +#define MCF_CCM_CCR_PLL_MODE (0x0003) +#define MCF_CCM_CCR_OSC_MODE (0x0005) +#define MCF_CCM_CCR_BOOTPS(x) (((x)&0x0003)<<3|0x0001) +#define MCF_CCM_CCR_LOAD (0x0021) +#define MCF_CCM_CCR_LIMP (0x0041) +#define MCF_CCM_CCR_CSC(x) (((x)&0x0003)<<8|0x0001) + +/* Bit definitions and macros for MCF_CCM_RCON */ +#define MCF_CCM_RCON_RESERVED (0x0001) +#define MCF_CCM_RCON_PLL_MODE (0x0003) +#define MCF_CCM_RCON_OSC_MODE (0x0005) +#define MCF_CCM_RCON_BOOTPS(x) (((x)&0x0003)<<3|0x0001) +#define MCF_CCM_RCON_LOAD (0x0021) +#define MCF_CCM_RCON_LIMP (0x0041) +#define MCF_CCM_RCON_CSC(x) (((x)&0x0003)<<8|0x0001) + +/* Bit definitions and macros for MCF_CCM_CIR */ +#define MCF_CCM_CIR_PRN(x) (((x)&0x003F)<<0) +#define MCF_CCM_CIR_PIN(x) (((x)&0x03FF)<<6) + +/* Bit definitions and macros for MCF_CCM_MISCCR */ +#define MCF_CCM_MISCCR_USBSRC (0x0001) +#define MCF_CCM_MISCCR_USBDIV (0x0002) +#define MCF_CCM_MISCCR_SSI_SRC (0x0010) +#define MCF_CCM_MISCCR_TIM_DMA (0x0020) +#define MCF_CCM_MISCCR_SSI_PUS (0x0040) +#define MCF_CCM_MISCCR_SSI_PUE (0x0080) +#define MCF_CCM_MISCCR_LCD_CHEN (0x0100) +#define MCF_CCM_MISCCR_LIMP (0x1000) +#define MCF_CCM_MISCCR_PLL_LOCK (0x2000) + +/* Bit definitions and macros for MCF_CCM_CDR */ +#define MCF_CCM_CDR_SSIDIV(x) (((x)&0x000F)<<0) +#define MCF_CCM_CDR_LPDIV(x) (((x)&0x000F)<<8) + +/* Bit definitions and macros for MCF_CCM_UHCSR */ +#define MCF_CCM_UHCSR_XPDE (0x0001) +#define MCF_CCM_UHCSR_UHMIE (0x0002) +#define MCF_CCM_UHCSR_WKUP (0x0004) +#define MCF_CCM_UHCSR_PORTIND(x) (((x)&0x0003)<<14) + +/* Bit definitions and macros for MCF_CCM_UOCSR */ +#define MCF_CCM_UOCSR_XPDE (0x0001) +#define MCF_CCM_UOCSR_UOMIE (0x0002) +#define MCF_CCM_UOCSR_WKUP (0x0004) +#define MCF_CCM_UOCSR_PWRFLT (0x0008) +#define MCF_CCM_UOCSR_SEND (0x0010) +#define MCF_CCM_UOCSR_VVLD (0x0020) +#define MCF_CCM_UOCSR_BVLD (0x0040) +#define MCF_CCM_UOCSR_AVLD (0x0080) +#define MCF_CCM_UOCSR_DPPU (0x0100) +#define MCF_CCM_UOCSR_DCR_VBUS (0x0200) +#define MCF_CCM_UOCSR_CRG_VBUS (0x0400) +#define MCF_CCM_UOCSR_DRV_VBUS (0x0800) +#define MCF_CCM_UOCSR_DMPD (0x1000) +#define MCF_CCM_UOCSR_DPPD (0x2000) +#define MCF_CCM_UOCSR_PORTIND(x) (((x)&0x0003)<<14) + +/********************************************************************* + * + * FlexBus Chip Selects (FBCS) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCF_FBCS0_CSAR 0xFC008000 +#define MCF_FBCS0_CSMR 0xFC008004 +#define MCF_FBCS0_CSCR 0xFC008008 +#define MCF_FBCS1_CSAR 0xFC00800C +#define MCF_FBCS1_CSMR 0xFC008010 +#define MCF_FBCS1_CSCR 0xFC008014 +#define MCF_FBCS2_CSAR 0xFC008018 +#define MCF_FBCS2_CSMR 0xFC00801C +#define MCF_FBCS2_CSCR 0xFC008020 +#define MCF_FBCS3_CSAR 0xFC008024 +#define MCF_FBCS3_CSMR 0xFC008028 +#define MCF_FBCS3_CSCR 0xFC00802C +#define MCF_FBCS4_CSAR 0xFC008030 +#define MCF_FBCS4_CSMR 0xFC008034 +#define MCF_FBCS4_CSCR 0xFC008038 +#define MCF_FBCS5_CSAR 0xFC00803C +#define MCF_FBCS5_CSMR 0xFC008040 +#define MCF_FBCS5_CSCR 0xFC008044 + +/* Bit definitions and macros for MCF_FBCS_CSAR */ +#define MCF_FBCS_CSAR_BA(x) ((x)&0xFFFF0000) + +/* Bit definitions and macros for MCF_FBCS_CSMR */ +#define MCF_FBCS_CSMR_V (0x00000001) +#define MCF_FBCS_CSMR_WP (0x00000100) +#define MCF_FBCS_CSMR_BAM(x) (((x)&0x0000FFFF)<<16) +#define MCF_FBCS_CSMR_BAM_4G (0xFFFF0000) +#define MCF_FBCS_CSMR_BAM_2G (0x7FFF0000) +#define MCF_FBCS_CSMR_BAM_1G (0x3FFF0000) +#define MCF_FBCS_CSMR_BAM_1024M (0x3FFF0000) +#define MCF_FBCS_CSMR_BAM_512M (0x1FFF0000) +#define MCF_FBCS_CSMR_BAM_256M (0x0FFF0000) +#define MCF_FBCS_CSMR_BAM_128M (0x07FF0000) +#define MCF_FBCS_CSMR_BAM_64M (0x03FF0000) +#define MCF_FBCS_CSMR_BAM_32M (0x01FF0000) +#define MCF_FBCS_CSMR_BAM_16M (0x00FF0000) +#define MCF_FBCS_CSMR_BAM_8M (0x007F0000) +#define MCF_FBCS_CSMR_BAM_4M (0x003F0000) +#define MCF_FBCS_CSMR_BAM_2M (0x001F0000) +#define MCF_FBCS_CSMR_BAM_1M (0x000F0000) +#define MCF_FBCS_CSMR_BAM_1024K (0x000F0000) +#define MCF_FBCS_CSMR_BAM_512K (0x00070000) +#define MCF_FBCS_CSMR_BAM_256K (0x00030000) +#define MCF_FBCS_CSMR_BAM_128K (0x00010000) +#define MCF_FBCS_CSMR_BAM_64K (0x00000000) + +/* Bit definitions and macros for MCF_FBCS_CSCR */ +#define MCF_FBCS_CSCR_BSTW (0x00000008) +#define MCF_FBCS_CSCR_BSTR (0x00000010) +#define MCF_FBCS_CSCR_BEM (0x00000020) +#define MCF_FBCS_CSCR_PS(x) (((x)&0x00000003)<<6) +#define MCF_FBCS_CSCR_AA (0x00000100) +#define MCF_FBCS_CSCR_SBM (0x00000200) +#define MCF_FBCS_CSCR_WS(x) (((x)&0x0000003F)<<10) +#define MCF_FBCS_CSCR_WRAH(x) (((x)&0x00000003)<<16) +#define MCF_FBCS_CSCR_RDAH(x) (((x)&0x00000003)<<18) +#define MCF_FBCS_CSCR_ASET(x) (((x)&0x00000003)<<20) +#define MCF_FBCS_CSCR_SWSEN (0x00800000) +#define MCF_FBCS_CSCR_SWS(x) (((x)&0x0000003F)<<26) +#define MCF_FBCS_CSCR_PS_8 (0x0040) +#define MCF_FBCS_CSCR_PS_16 (0x0080) +#define MCF_FBCS_CSCR_PS_32 (0x0000) + +/********************************************************************* + * + * General Purpose I/O (GPIO) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCFGPIO_PODR_FECH (0xFC0A4000) +#define MCFGPIO_PODR_FECL (0xFC0A4001) +#define MCFGPIO_PODR_SSI (0xFC0A4002) +#define MCFGPIO_PODR_BUSCTL (0xFC0A4003) +#define MCFGPIO_PODR_BE (0xFC0A4004) +#define MCFGPIO_PODR_CS (0xFC0A4005) +#define MCFGPIO_PODR_PWM (0xFC0A4006) +#define MCFGPIO_PODR_FECI2C (0xFC0A4007) +#define MCFGPIO_PODR_UART (0xFC0A4009) +#define MCFGPIO_PODR_QSPI (0xFC0A400A) +#define MCFGPIO_PODR_TIMER (0xFC0A400B) +#define MCFGPIO_PODR_LCDDATAH (0xFC0A400D) +#define MCFGPIO_PODR_LCDDATAM (0xFC0A400E) +#define MCFGPIO_PODR_LCDDATAL (0xFC0A400F) +#define MCFGPIO_PODR_LCDCTLH (0xFC0A4010) +#define MCFGPIO_PODR_LCDCTLL (0xFC0A4011) +#define MCFGPIO_PDDR_FECH (0xFC0A4014) +#define MCFGPIO_PDDR_FECL (0xFC0A4015) +#define MCFGPIO_PDDR_SSI (0xFC0A4016) +#define MCFGPIO_PDDR_BUSCTL (0xFC0A4017) +#define MCFGPIO_PDDR_BE (0xFC0A4018) +#define MCFGPIO_PDDR_CS (0xFC0A4019) +#define MCFGPIO_PDDR_PWM (0xFC0A401A) +#define MCFGPIO_PDDR_FECI2C (0xFC0A401B) +#define MCFGPIO_PDDR_UART (0xFC0A401C) +#define MCFGPIO_PDDR_QSPI (0xFC0A401E) +#define MCFGPIO_PDDR_TIMER (0xFC0A401F) +#define MCFGPIO_PDDR_LCDDATAH (0xFC0A4021) +#define MCFGPIO_PDDR_LCDDATAM (0xFC0A4022) +#define MCFGPIO_PDDR_LCDDATAL (0xFC0A4023) +#define MCFGPIO_PDDR_LCDCTLH (0xFC0A4024) +#define MCFGPIO_PDDR_LCDCTLL (0xFC0A4025) +#define MCFGPIO_PPDSDR_FECH (0xFC0A4028) +#define MCFGPIO_PPDSDR_FECL (0xFC0A4029) +#define MCFGPIO_PPDSDR_SSI (0xFC0A402A) +#define MCFGPIO_PPDSDR_BUSCTL (0xFC0A402B) +#define MCFGPIO_PPDSDR_BE (0xFC0A402C) +#define MCFGPIO_PPDSDR_CS (0xFC0A402D) +#define MCFGPIO_PPDSDR_PWM (0xFC0A402E) +#define MCFGPIO_PPDSDR_FECI2C (0xFC0A402F) +#define MCFGPIO_PPDSDR_UART (0xFC0A4031) +#define MCFGPIO_PPDSDR_QSPI (0xFC0A4032) +#define MCFGPIO_PPDSDR_TIMER (0xFC0A4033) +#define MCFGPIO_PPDSDR_LCDDATAH (0xFC0A4035) +#define MCFGPIO_PPDSDR_LCDDATAM (0xFC0A4036) +#define MCFGPIO_PPDSDR_LCDDATAL (0xFC0A4037) +#define MCFGPIO_PPDSDR_LCDCTLH (0xFC0A4038) +#define MCFGPIO_PPDSDR_LCDCTLL (0xFC0A4039) +#define MCFGPIO_PCLRR_FECH (0xFC0A403C) +#define MCFGPIO_PCLRR_FECL (0xFC0A403D) +#define MCFGPIO_PCLRR_SSI (0xFC0A403E) +#define MCFGPIO_PCLRR_BUSCTL (0xFC0A403F) +#define MCFGPIO_PCLRR_BE (0xFC0A4040) +#define MCFGPIO_PCLRR_CS (0xFC0A4041) +#define MCFGPIO_PCLRR_PWM (0xFC0A4042) +#define MCFGPIO_PCLRR_FECI2C (0xFC0A4043) +#define MCFGPIO_PCLRR_UART (0xFC0A4045) +#define MCFGPIO_PCLRR_QSPI (0xFC0A4046) +#define MCFGPIO_PCLRR_TIMER (0xFC0A4047) +#define MCFGPIO_PCLRR_LCDDATAH (0xFC0A4049) +#define MCFGPIO_PCLRR_LCDDATAM (0xFC0A404A) +#define MCFGPIO_PCLRR_LCDDATAL (0xFC0A404B) +#define MCFGPIO_PCLRR_LCDCTLH (0xFC0A404C) +#define MCFGPIO_PCLRR_LCDCTLL (0xFC0A404D) +#define MCFGPIO_PAR_FEC (0xFC0A4050) +#define MCFGPIO_PAR_PWM (0xFC0A4051) +#define MCFGPIO_PAR_BUSCTL (0xFC0A4052) +#define MCFGPIO_PAR_FECI2C (0xFC0A4053) +#define MCFGPIO_PAR_BE (0xFC0A4054) +#define MCFGPIO_PAR_CS (0xFC0A4055) +#define MCFGPIO_PAR_SSI (0xFC0A4056) +#define MCFGPIO_PAR_UART (0xFC0A4058) +#define MCFGPIO_PAR_QSPI (0xFC0A405A) +#define MCFGPIO_PAR_TIMER (0xFC0A405C) +#define MCFGPIO_PAR_LCDDATA (0xFC0A405D) +#define MCFGPIO_PAR_LCDCTL (0xFC0A405E) +#define MCFGPIO_PAR_IRQ (0xFC0A4060) +#define MCFGPIO_MSCR_FLEXBUS (0xFC0A4064) +#define MCFGPIO_MSCR_SDRAM (0xFC0A4065) +#define MCFGPIO_DSCR_I2C (0xFC0A4068) +#define MCFGPIO_DSCR_PWM (0xFC0A4069) +#define MCFGPIO_DSCR_FEC (0xFC0A406A) +#define MCFGPIO_DSCR_UART (0xFC0A406B) +#define MCFGPIO_DSCR_QSPI (0xFC0A406C) +#define MCFGPIO_DSCR_TIMER (0xFC0A406D) +#define MCFGPIO_DSCR_SSI (0xFC0A406E) +#define MCFGPIO_DSCR_LCD (0xFC0A406F) +#define MCFGPIO_DSCR_DEBUG (0xFC0A4070) +#define MCFGPIO_DSCR_CLKRST (0xFC0A4071) +#define MCFGPIO_DSCR_IRQ (0xFC0A4072) + +/* Bit definitions and macros for MCF_GPIO_PODR_FECH */ +#define MCF_GPIO_PODR_FECH_PODR_FECH0 (0x01) +#define MCF_GPIO_PODR_FECH_PODR_FECH1 (0x02) +#define MCF_GPIO_PODR_FECH_PODR_FECH2 (0x04) +#define MCF_GPIO_PODR_FECH_PODR_FECH3 (0x08) +#define MCF_GPIO_PODR_FECH_PODR_FECH4 (0x10) +#define MCF_GPIO_PODR_FECH_PODR_FECH5 (0x20) +#define MCF_GPIO_PODR_FECH_PODR_FECH6 (0x40) +#define MCF_GPIO_PODR_FECH_PODR_FECH7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PODR_FECL */ +#define MCF_GPIO_PODR_FECL_PODR_FECL0 (0x01) +#define MCF_GPIO_PODR_FECL_PODR_FECL1 (0x02) +#define MCF_GPIO_PODR_FECL_PODR_FECL2 (0x04) +#define MCF_GPIO_PODR_FECL_PODR_FECL3 (0x08) +#define MCF_GPIO_PODR_FECL_PODR_FECL4 (0x10) +#define MCF_GPIO_PODR_FECL_PODR_FECL5 (0x20) +#define MCF_GPIO_PODR_FECL_PODR_FECL6 (0x40) +#define MCF_GPIO_PODR_FECL_PODR_FECL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PODR_SSI */ +#define MCF_GPIO_PODR_SSI_PODR_SSI0 (0x01) +#define MCF_GPIO_PODR_SSI_PODR_SSI1 (0x02) +#define MCF_GPIO_PODR_SSI_PODR_SSI2 (0x04) +#define MCF_GPIO_PODR_SSI_PODR_SSI3 (0x08) +#define MCF_GPIO_PODR_SSI_PODR_SSI4 (0x10) + +/* Bit definitions and macros for MCF_GPIO_PODR_BUSCTL */ +#define MCF_GPIO_PODR_BUSCTL_POSDR_BUSCTL0 (0x01) +#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL1 (0x02) +#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL2 (0x04) +#define MCF_GPIO_PODR_BUSCTL_PODR_BUSCTL3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PODR_BE */ +#define MCF_GPIO_PODR_BE_PODR_BE0 (0x01) +#define MCF_GPIO_PODR_BE_PODR_BE1 (0x02) +#define MCF_GPIO_PODR_BE_PODR_BE2 (0x04) +#define MCF_GPIO_PODR_BE_PODR_BE3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PODR_CS */ +#define MCF_GPIO_PODR_CS_PODR_CS1 (0x02) +#define MCF_GPIO_PODR_CS_PODR_CS2 (0x04) +#define MCF_GPIO_PODR_CS_PODR_CS3 (0x08) +#define MCF_GPIO_PODR_CS_PODR_CS4 (0x10) +#define MCF_GPIO_PODR_CS_PODR_CS5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PODR_PWM */ +#define MCF_GPIO_PODR_PWM_PODR_PWM2 (0x04) +#define MCF_GPIO_PODR_PWM_PODR_PWM3 (0x08) +#define MCF_GPIO_PODR_PWM_PODR_PWM4 (0x10) +#define MCF_GPIO_PODR_PWM_PODR_PWM5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PODR_FECI2C */ +#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C0 (0x01) +#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C1 (0x02) +#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C2 (0x04) +#define MCF_GPIO_PODR_FECI2C_PODR_FECI2C3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PODR_UART */ +#define MCF_GPIO_PODR_UART_PODR_UART0 (0x01) +#define MCF_GPIO_PODR_UART_PODR_UART1 (0x02) +#define MCF_GPIO_PODR_UART_PODR_UART2 (0x04) +#define MCF_GPIO_PODR_UART_PODR_UART3 (0x08) +#define MCF_GPIO_PODR_UART_PODR_UART4 (0x10) +#define MCF_GPIO_PODR_UART_PODR_UART5 (0x20) +#define MCF_GPIO_PODR_UART_PODR_UART6 (0x40) +#define MCF_GPIO_PODR_UART_PODR_UART7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PODR_QSPI */ +#define MCF_GPIO_PODR_QSPI_PODR_QSPI0 (0x01) +#define MCF_GPIO_PODR_QSPI_PODR_QSPI1 (0x02) +#define MCF_GPIO_PODR_QSPI_PODR_QSPI2 (0x04) +#define MCF_GPIO_PODR_QSPI_PODR_QSPI3 (0x08) +#define MCF_GPIO_PODR_QSPI_PODR_QSPI4 (0x10) +#define MCF_GPIO_PODR_QSPI_PODR_QSPI5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PODR_TIMER */ +#define MCF_GPIO_PODR_TIMER_PODR_TIMER0 (0x01) +#define MCF_GPIO_PODR_TIMER_PODR_TIMER1 (0x02) +#define MCF_GPIO_PODR_TIMER_PODR_TIMER2 (0x04) +#define MCF_GPIO_PODR_TIMER_PODR_TIMER3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAH */ +#define MCF_GPIO_PODR_LCDDATAH_PODR_LCDDATAH0 (0x01) +#define MCF_GPIO_PODR_LCDDATAH_PODR_LCDDATAH1 (0x02) + +/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAM */ +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM0 (0x01) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM1 (0x02) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM2 (0x04) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM3 (0x08) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM4 (0x10) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM5 (0x20) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM6 (0x40) +#define MCF_GPIO_PODR_LCDDATAM_PODR_LCDDATAM7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PODR_LCDDATAL */ +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL0 (0x01) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL1 (0x02) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL2 (0x04) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL3 (0x08) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL4 (0x10) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL5 (0x20) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL6 (0x40) +#define MCF_GPIO_PODR_LCDDATAL_PODR_LCDDATAL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PODR_LCDCTLH */ +#define MCF_GPIO_PODR_LCDCTLH_PODR_LCDCTLH0 (0x01) + +/* Bit definitions and macros for MCF_GPIO_PODR_LCDCTLL */ +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL0 (0x01) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL1 (0x02) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL2 (0x04) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL3 (0x08) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL4 (0x10) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL5 (0x20) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL6 (0x40) +#define MCF_GPIO_PODR_LCDCTLL_PODR_LCDCTLL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_FECH */ +#define MCF_GPIO_PDDR_FECH_PDDR_FECH0 (0x01) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH1 (0x02) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH2 (0x04) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH3 (0x08) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH4 (0x10) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH5 (0x20) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH6 (0x40) +#define MCF_GPIO_PDDR_FECH_PDDR_FECH7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_FECL */ +#define MCF_GPIO_PDDR_FECL_PDDR_FECL0 (0x01) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL1 (0x02) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL2 (0x04) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL3 (0x08) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL4 (0x10) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL5 (0x20) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL6 (0x40) +#define MCF_GPIO_PDDR_FECL_PDDR_FECL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_SSI */ +#define MCF_GPIO_PDDR_SSI_PDDR_SSI0 (0x01) +#define MCF_GPIO_PDDR_SSI_PDDR_SSI1 (0x02) +#define MCF_GPIO_PDDR_SSI_PDDR_SSI2 (0x04) +#define MCF_GPIO_PDDR_SSI_PDDR_SSI3 (0x08) +#define MCF_GPIO_PDDR_SSI_PDDR_SSI4 (0x10) + +/* Bit definitions and macros for MCF_GPIO_PDDR_BUSCTL */ +#define MCF_GPIO_PDDR_BUSCTL_POSDR_BUSCTL0 (0x01) +#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL1 (0x02) +#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL2 (0x04) +#define MCF_GPIO_PDDR_BUSCTL_PDDR_BUSCTL3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PDDR_BE */ +#define MCF_GPIO_PDDR_BE_PDDR_BE0 (0x01) +#define MCF_GPIO_PDDR_BE_PDDR_BE1 (0x02) +#define MCF_GPIO_PDDR_BE_PDDR_BE2 (0x04) +#define MCF_GPIO_PDDR_BE_PDDR_BE3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PDDR_CS */ +#define MCF_GPIO_PDDR_CS_PDDR_CS1 (0x02) +#define MCF_GPIO_PDDR_CS_PDDR_CS2 (0x04) +#define MCF_GPIO_PDDR_CS_PDDR_CS3 (0x08) +#define MCF_GPIO_PDDR_CS_PDDR_CS4 (0x10) +#define MCF_GPIO_PDDR_CS_PDDR_CS5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PDDR_PWM */ +#define MCF_GPIO_PDDR_PWM_PDDR_PWM2 (0x04) +#define MCF_GPIO_PDDR_PWM_PDDR_PWM3 (0x08) +#define MCF_GPIO_PDDR_PWM_PDDR_PWM4 (0x10) +#define MCF_GPIO_PDDR_PWM_PDDR_PWM5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PDDR_FECI2C */ +#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C0 (0x01) +#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C1 (0x02) +#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C2 (0x04) +#define MCF_GPIO_PDDR_FECI2C_PDDR_FECI2C3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PDDR_UART */ +#define MCF_GPIO_PDDR_UART_PDDR_UART0 (0x01) +#define MCF_GPIO_PDDR_UART_PDDR_UART1 (0x02) +#define MCF_GPIO_PDDR_UART_PDDR_UART2 (0x04) +#define MCF_GPIO_PDDR_UART_PDDR_UART3 (0x08) +#define MCF_GPIO_PDDR_UART_PDDR_UART4 (0x10) +#define MCF_GPIO_PDDR_UART_PDDR_UART5 (0x20) +#define MCF_GPIO_PDDR_UART_PDDR_UART6 (0x40) +#define MCF_GPIO_PDDR_UART_PDDR_UART7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_QSPI */ +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI0 (0x01) +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI1 (0x02) +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI2 (0x04) +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI3 (0x08) +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI4 (0x10) +#define MCF_GPIO_PDDR_QSPI_PDDR_QSPI5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PDDR_TIMER */ +#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER0 (0x01) +#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER1 (0x02) +#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER2 (0x04) +#define MCF_GPIO_PDDR_TIMER_PDDR_TIMER3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAH */ +#define MCF_GPIO_PDDR_LCDDATAH_PDDR_LCDDATAH0 (0x01) +#define MCF_GPIO_PDDR_LCDDATAH_PDDR_LCDDATAH1 (0x02) + +/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAM */ +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM0 (0x01) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM1 (0x02) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM2 (0x04) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM3 (0x08) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM4 (0x10) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM5 (0x20) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM6 (0x40) +#define MCF_GPIO_PDDR_LCDDATAM_PDDR_LCDDATAM7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_LCDDATAL */ +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL0 (0x01) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL1 (0x02) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL2 (0x04) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL3 (0x08) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL4 (0x10) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL5 (0x20) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL6 (0x40) +#define MCF_GPIO_PDDR_LCDDATAL_PDDR_LCDDATAL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PDDR_LCDCTLH */ +#define MCF_GPIO_PDDR_LCDCTLH_PDDR_LCDCTLH0 (0x01) + +/* Bit definitions and macros for MCF_GPIO_PDDR_LCDCTLL */ +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL0 (0x01) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL1 (0x02) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL2 (0x04) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL3 (0x08) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL4 (0x10) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL5 (0x20) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL6 (0x40) +#define MCF_GPIO_PDDR_LCDCTLL_PDDR_LCDCTLL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECH */ +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH0 (0x01) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH1 (0x02) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH2 (0x04) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH3 (0x08) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH4 (0x10) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH5 (0x20) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH6 (0x40) +#define MCF_GPIO_PPDSDR_FECH_PPDSDR_FECH7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECL */ +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL0 (0x01) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL1 (0x02) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL2 (0x04) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL3 (0x08) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL4 (0x10) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL5 (0x20) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL6 (0x40) +#define MCF_GPIO_PPDSDR_FECL_PPDSDR_FECL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_SSI */ +#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI0 (0x01) +#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI1 (0x02) +#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI2 (0x04) +#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI3 (0x08) +#define MCF_GPIO_PPDSDR_SSI_PPDSDR_SSI4 (0x10) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_BUSCTL */ +#define MCF_GPIO_PPDSDR_BUSCTL_POSDR_BUSCTL0 (0x01) +#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL1 (0x02) +#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL2 (0x04) +#define MCF_GPIO_PPDSDR_BUSCTL_PPDSDR_BUSCTL3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_BE */ +#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE0 (0x01) +#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE1 (0x02) +#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE2 (0x04) +#define MCF_GPIO_PPDSDR_BE_PPDSDR_BE3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_CS */ +#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS1 (0x02) +#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS2 (0x04) +#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS3 (0x08) +#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS4 (0x10) +#define MCF_GPIO_PPDSDR_CS_PPDSDR_CS5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_PWM */ +#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM2 (0x04) +#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM3 (0x08) +#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM4 (0x10) +#define MCF_GPIO_PPDSDR_PWM_PPDSDR_PWM5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_FECI2C */ +#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C0 (0x01) +#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C1 (0x02) +#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C2 (0x04) +#define MCF_GPIO_PPDSDR_FECI2C_PPDSDR_FECI2C3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_UART */ +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART0 (0x01) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART1 (0x02) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART2 (0x04) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART3 (0x08) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART4 (0x10) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART5 (0x20) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART6 (0x40) +#define MCF_GPIO_PPDSDR_UART_PPDSDR_UART7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_QSPI */ +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI0 (0x01) +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI1 (0x02) +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI2 (0x04) +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI3 (0x08) +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI4 (0x10) +#define MCF_GPIO_PPDSDR_QSPI_PPDSDR_QSPI5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_TIMER */ +#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER0 (0x01) +#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER1 (0x02) +#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER2 (0x04) +#define MCF_GPIO_PPDSDR_TIMER_PPDSDR_TIMER3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAH */ +#define MCF_GPIO_PPDSDR_LCDDATAH_PPDSDR_LCDDATAH0 (0x01) +#define MCF_GPIO_PPDSDR_LCDDATAH_PPDSDR_LCDDATAH1 (0x02) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAM */ +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM0 (0x01) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM1 (0x02) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM2 (0x04) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM3 (0x08) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM4 (0x10) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM5 (0x20) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM6 (0x40) +#define MCF_GPIO_PPDSDR_LCDDATAM_PPDSDR_LCDDATAM7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDDATAL */ +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL0 (0x01) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL1 (0x02) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL2 (0x04) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL3 (0x08) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL4 (0x10) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL5 (0x20) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL6 (0x40) +#define MCF_GPIO_PPDSDR_LCDDATAL_PPDSDR_LCDDATAL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDCTLH */ +#define MCF_GPIO_PPDSDR_LCDCTLH_PPDSDR_LCDCTLH0 (0x01) + +/* Bit definitions and macros for MCF_GPIO_PPDSDR_LCDCTLL */ +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL0 (0x01) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL1 (0x02) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL2 (0x04) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL3 (0x08) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL4 (0x10) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL5 (0x20) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL6 (0x40) +#define MCF_GPIO_PPDSDR_LCDCTLL_PPDSDR_LCDCTLL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_FECH */ +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH0 (0x01) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH1 (0x02) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH2 (0x04) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH3 (0x08) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH4 (0x10) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH5 (0x20) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH6 (0x40) +#define MCF_GPIO_PCLRR_FECH_PCLRR_FECH7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_FECL */ +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL0 (0x01) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL1 (0x02) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL2 (0x04) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL3 (0x08) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL4 (0x10) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL5 (0x20) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL6 (0x40) +#define MCF_GPIO_PCLRR_FECL_PCLRR_FECL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_SSI */ +#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI0 (0x01) +#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI1 (0x02) +#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI2 (0x04) +#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI3 (0x08) +#define MCF_GPIO_PCLRR_SSI_PCLRR_SSI4 (0x10) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_BUSCTL */ +#define MCF_GPIO_PCLRR_BUSCTL_POSDR_BUSCTL0 (0x01) +#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL1 (0x02) +#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL2 (0x04) +#define MCF_GPIO_PCLRR_BUSCTL_PCLRR_BUSCTL3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_BE */ +#define MCF_GPIO_PCLRR_BE_PCLRR_BE0 (0x01) +#define MCF_GPIO_PCLRR_BE_PCLRR_BE1 (0x02) +#define MCF_GPIO_PCLRR_BE_PCLRR_BE2 (0x04) +#define MCF_GPIO_PCLRR_BE_PCLRR_BE3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_CS */ +#define MCF_GPIO_PCLRR_CS_PCLRR_CS1 (0x02) +#define MCF_GPIO_PCLRR_CS_PCLRR_CS2 (0x04) +#define MCF_GPIO_PCLRR_CS_PCLRR_CS3 (0x08) +#define MCF_GPIO_PCLRR_CS_PCLRR_CS4 (0x10) +#define MCF_GPIO_PCLRR_CS_PCLRR_CS5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_PWM */ +#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM2 (0x04) +#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM3 (0x08) +#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM4 (0x10) +#define MCF_GPIO_PCLRR_PWM_PCLRR_PWM5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_FECI2C */ +#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C0 (0x01) +#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C1 (0x02) +#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C2 (0x04) +#define MCF_GPIO_PCLRR_FECI2C_PCLRR_FECI2C3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_UART */ +#define MCF_GPIO_PCLRR_UART_PCLRR_UART0 (0x01) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART1 (0x02) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART2 (0x04) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART3 (0x08) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART4 (0x10) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART5 (0x20) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART6 (0x40) +#define MCF_GPIO_PCLRR_UART_PCLRR_UART7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_QSPI */ +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI0 (0x01) +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI1 (0x02) +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI2 (0x04) +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI3 (0x08) +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI4 (0x10) +#define MCF_GPIO_PCLRR_QSPI_PCLRR_QSPI5 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_TIMER */ +#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER0 (0x01) +#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER1 (0x02) +#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER2 (0x04) +#define MCF_GPIO_PCLRR_TIMER_PCLRR_TIMER3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAH */ +#define MCF_GPIO_PCLRR_LCDDATAH_PCLRR_LCDDATAH0 (0x01) +#define MCF_GPIO_PCLRR_LCDDATAH_PCLRR_LCDDATAH1 (0x02) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAM */ +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM0 (0x01) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM1 (0x02) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM2 (0x04) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM3 (0x08) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM4 (0x10) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM5 (0x20) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM6 (0x40) +#define MCF_GPIO_PCLRR_LCDDATAM_PCLRR_LCDDATAM7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDDATAL */ +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL0 (0x01) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL1 (0x02) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL2 (0x04) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL3 (0x08) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL4 (0x10) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL5 (0x20) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL6 (0x40) +#define MCF_GPIO_PCLRR_LCDDATAL_PCLRR_LCDDATAL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDCTLH */ +#define MCF_GPIO_PCLRR_LCDCTLH_PCLRR_LCDCTLH0 (0x01) + +/* Bit definitions and macros for MCF_GPIO_PCLRR_LCDCTLL */ +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL0 (0x01) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL1 (0x02) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL2 (0x04) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL3 (0x08) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL4 (0x10) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL5 (0x20) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL6 (0x40) +#define MCF_GPIO_PCLRR_LCDCTLL_PCLRR_LCDCTLL7 (0x80) + +/* Bit definitions and macros for MCF_GPIO_PAR_FEC */ +#define MCF_GPIO_PAR_FEC_PAR_FEC_MII(x) (((x)&0x03)<<0) +#define MCF_GPIO_PAR_FEC_PAR_FEC_7W(x) (((x)&0x03)<<2) +#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_GPIO (0x00) +#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_URTS1 (0x04) +#define MCF_GPIO_PAR_FEC_PAR_FEC_7W_FEC (0x0C) +#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_GPIO (0x00) +#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_UART (0x01) +#define MCF_GPIO_PAR_FEC_PAR_FEC_MII_FEC (0x03) + +/* Bit definitions and macros for MCF_GPIO_PAR_PWM */ +#define MCF_GPIO_PAR_PWM_PAR_PWM1(x) (((x)&0x03)<<0) +#define MCF_GPIO_PAR_PWM_PAR_PWM3(x) (((x)&0x03)<<2) +#define MCF_GPIO_PAR_PWM_PAR_PWM5 (0x10) +#define MCF_GPIO_PAR_PWM_PAR_PWM7 (0x20) + +/* Bit definitions and macros for MCF_GPIO_PAR_BUSCTL */ +#define MCF_GPIO_PAR_BUSCTL_PAR_TS(x) (((x)&0x03)<<3) +#define MCF_GPIO_PAR_BUSCTL_PAR_RWB (0x20) +#define MCF_GPIO_PAR_BUSCTL_PAR_TA (0x40) +#define MCF_GPIO_PAR_BUSCTL_PAR_OE (0x80) +#define MCF_GPIO_PAR_BUSCTL_PAR_OE_GPIO (0x00) +#define MCF_GPIO_PAR_BUSCTL_PAR_OE_OE (0x80) +#define MCF_GPIO_PAR_BUSCTL_PAR_TA_GPIO (0x00) +#define MCF_GPIO_PAR_BUSCTL_PAR_TA_TA (0x40) +#define MCF_GPIO_PAR_BUSCTL_PAR_RWB_GPIO (0x00) +#define MCF_GPIO_PAR_BUSCTL_PAR_RWB_RWB (0x20) +#define MCF_GPIO_PAR_BUSCTL_PAR_TS_GPIO (0x00) +#define MCF_GPIO_PAR_BUSCTL_PAR_TS_DACK0 (0x10) +#define MCF_GPIO_PAR_BUSCTL_PAR_TS_TS (0x18) + +/* Bit definitions and macros for MCF_GPIO_PAR_FECI2C */ +#define MCF_GPIO_PAR_FECI2C_PAR_SDA(x) (((x)&0x03)<<0) +#define MCF_GPIO_PAR_FECI2C_PAR_SCL(x) (((x)&0x03)<<2) +#define MCF_GPIO_PAR_FECI2C_PAR_MDIO(x) (((x)&0x03)<<4) +#define MCF_GPIO_PAR_FECI2C_PAR_MDC(x) (((x)&0x03)<<6) +#define MCF_GPIO_PAR_FECI2C_PAR_MDC_GPIO (0x00) +#define MCF_GPIO_PAR_FECI2C_PAR_MDC_UTXD2 (0x40) +#define MCF_GPIO_PAR_FECI2C_PAR_MDC_SCL (0x80) +#define MCF_GPIO_PAR_FECI2C_PAR_MDC_EMDC (0xC0) +#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_GPIO (0x00) +#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_URXD2 (0x10) +#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_SDA (0x20) +#define MCF_GPIO_PAR_FECI2C_PAR_MDIO_EMDIO (0x30) +#define MCF_GPIO_PAR_FECI2C_PAR_SCL_GPIO (0x00) +#define MCF_GPIO_PAR_FECI2C_PAR_SCL_UTXD2 (0x04) +#define MCF_GPIO_PAR_FECI2C_PAR_SCL_SCL (0x0C) +#define MCF_GPIO_PAR_FECI2C_PAR_SDA_GPIO (0x00) +#define MCF_GPIO_PAR_FECI2C_PAR_SDA_URXD2 (0x02) +#define MCF_GPIO_PAR_FECI2C_PAR_SDA_SDA (0x03) + +/* Bit definitions and macros for MCF_GPIO_PAR_BE */ +#define MCF_GPIO_PAR_BE_PAR_BE0 (0x01) +#define MCF_GPIO_PAR_BE_PAR_BE1 (0x02) +#define MCF_GPIO_PAR_BE_PAR_BE2 (0x04) +#define MCF_GPIO_PAR_BE_PAR_BE3 (0x08) + +/* Bit definitions and macros for MCF_GPIO_PAR_CS */ +#define MCF_GPIO_PAR_CS_PAR_CS1 (0x02) +#define MCF_GPIO_PAR_CS_PAR_CS2 (0x04) +#define MCF_GPIO_PAR_CS_PAR_CS3 (0x08) +#define MCF_GPIO_PAR_CS_PAR_CS4 (0x10) +#define MCF_GPIO_PAR_CS_PAR_CS5 (0x20) +#define MCF_GPIO_PAR_CS_PAR_CS_CS1_GPIO (0x00) +#define MCF_GPIO_PAR_CS_PAR_CS_CS1_SDCS1 (0x01) +#define MCF_GPIO_PAR_CS_PAR_CS_CS1_CS1 (0x03) + +/* Bit definitions and macros for MCF_GPIO_PAR_SSI */ +#define MCF_GPIO_PAR_SSI_PAR_MCLK (0x0080) +#define MCF_GPIO_PAR_SSI_PAR_TXD(x) (((x)&0x0003)<<8) +#define MCF_GPIO_PAR_SSI_PAR_RXD(x) (((x)&0x0003)<<10) +#define MCF_GPIO_PAR_SSI_PAR_FS(x) (((x)&0x0003)<<12) +#define MCF_GPIO_PAR_SSI_PAR_BCLK(x) (((x)&0x0003)<<14) + +/* Bit definitions and macros for MCF_GPIO_PAR_UART */ +#define MCF_GPIO_PAR_UART_PAR_UTXD0 (0x0001) +#define MCF_GPIO_PAR_UART_PAR_URXD0 (0x0002) +#define MCF_GPIO_PAR_UART_PAR_URTS0 (0x0004) +#define MCF_GPIO_PAR_UART_PAR_UCTS0 (0x0008) +#define MCF_GPIO_PAR_UART_PAR_UTXD1(x) (((x)&0x0003)<<4) +#define MCF_GPIO_PAR_UART_PAR_URXD1(x) (((x)&0x0003)<<6) +#define MCF_GPIO_PAR_UART_PAR_URTS1(x) (((x)&0x0003)<<8) +#define MCF_GPIO_PAR_UART_PAR_UCTS1(x) (((x)&0x0003)<<10) +#define MCF_GPIO_PAR_UART_PAR_UCTS1_GPIO (0x0000) +#define MCF_GPIO_PAR_UART_PAR_UCTS1_SSI_BCLK (0x0800) +#define MCF_GPIO_PAR_UART_PAR_UCTS1_ULPI_D7 (0x0400) +#define MCF_GPIO_PAR_UART_PAR_UCTS1_UCTS1 (0x0C00) +#define MCF_GPIO_PAR_UART_PAR_URTS1_GPIO (0x0000) +#define MCF_GPIO_PAR_UART_PAR_URTS1_SSI_FS (0x0200) +#define MCF_GPIO_PAR_UART_PAR_URTS1_ULPI_D6 (0x0100) +#define MCF_GPIO_PAR_UART_PAR_URTS1_URTS1 (0x0300) +#define MCF_GPIO_PAR_UART_PAR_URXD1_GPIO (0x0000) +#define MCF_GPIO_PAR_UART_PAR_URXD1_SSI_RXD (0x0080) +#define MCF_GPIO_PAR_UART_PAR_URXD1_ULPI_D5 (0x0040) +#define MCF_GPIO_PAR_UART_PAR_URXD1_URXD1 (0x00C0) +#define MCF_GPIO_PAR_UART_PAR_UTXD1_GPIO (0x0000) +#define MCF_GPIO_PAR_UART_PAR_UTXD1_SSI_TXD (0x0020) +#define MCF_GPIO_PAR_UART_PAR_UTXD1_ULPI_D4 (0x0010) +#define MCF_GPIO_PAR_UART_PAR_UTXD1_UTXD1 (0x0030) + +/* Bit definitions and macros for MCF_GPIO_PAR_QSPI */ +#define MCF_GPIO_PAR_QSPI_PAR_SCK(x) (((x)&0x0003)<<4) +#define MCF_GPIO_PAR_QSPI_PAR_DOUT(x) (((x)&0x0003)<<6) +#define MCF_GPIO_PAR_QSPI_PAR_DIN(x) (((x)&0x0003)<<8) +#define MCF_GPIO_PAR_QSPI_PAR_PCS0(x) (((x)&0x0003)<<10) +#define MCF_GPIO_PAR_QSPI_PAR_PCS1(x) (((x)&0x0003)<<12) +#define MCF_GPIO_PAR_QSPI_PAR_PCS2(x) (((x)&0x0003)<<14) + +/* Bit definitions and macros for MCF_GPIO_PAR_TIMER */ +#define MCF_GPIO_PAR_TIMER_PAR_TIN0(x) (((x)&0x03)<<0) +#define MCF_GPIO_PAR_TIMER_PAR_TIN1(x) (((x)&0x03)<<2) +#define MCF_GPIO_PAR_TIMER_PAR_TIN2(x) (((x)&0x03)<<4) +#define MCF_GPIO_PAR_TIMER_PAR_TIN3(x) (((x)&0x03)<<6) +#define MCF_GPIO_PAR_TIMER_PAR_TIN3_GPIO (0x00) +#define MCF_GPIO_PAR_TIMER_PAR_TIN3_TOUT3 (0x80) +#define MCF_GPIO_PAR_TIMER_PAR_TIN3_URXD2 (0x40) +#define MCF_GPIO_PAR_TIMER_PAR_TIN3_TIN3 (0xC0) +#define MCF_GPIO_PAR_TIMER_PAR_TIN2_GPIO (0x00) +#define MCF_GPIO_PAR_TIMER_PAR_TIN2_TOUT2 (0x20) +#define MCF_GPIO_PAR_TIMER_PAR_TIN2_UTXD2 (0x10) +#define MCF_GPIO_PAR_TIMER_PAR_TIN2_TIN2 (0x30) +#define MCF_GPIO_PAR_TIMER_PAR_TIN1_GPIO (0x00) +#define MCF_GPIO_PAR_TIMER_PAR_TIN1_TOUT1 (0x08) +#define MCF_GPIO_PAR_TIMER_PAR_TIN1_DACK1 (0x04) +#define MCF_GPIO_PAR_TIMER_PAR_TIN1_TIN1 (0x0C) +#define MCF_GPIO_PAR_TIMER_PAR_TIN0_GPIO (0x00) +#define MCF_GPIO_PAR_TIMER_PAR_TIN0_TOUT0 (0x02) +#define MCF_GPIO_PAR_TIMER_PAR_TIN0_DREQ0 (0x01) +#define MCF_GPIO_PAR_TIMER_PAR_TIN0_TIN0 (0x03) + +/* Bit definitions and macros for MCF_GPIO_PAR_LCDDATA */ +#define MCF_GPIO_PAR_LCDDATA_PAR_LD7_0(x) (((x)&0x03)<<0) +#define MCF_GPIO_PAR_LCDDATA_PAR_LD15_8(x) (((x)&0x03)<<2) +#define MCF_GPIO_PAR_LCDDATA_PAR_LD16(x) (((x)&0x03)<<4) +#define MCF_GPIO_PAR_LCDDATA_PAR_LD17(x) (((x)&0x03)<<6) + +/* Bit definitions and macros for MCF_GPIO_PAR_LCDCTL */ +#define MCF_GPIO_PAR_LCDCTL_PAR_CLS (0x0001) +#define MCF_GPIO_PAR_LCDCTL_PAR_PS (0x0002) +#define MCF_GPIO_PAR_LCDCTL_PAR_REV (0x0004) +#define MCF_GPIO_PAR_LCDCTL_PAR_SPL_SPR (0x0008) +#define MCF_GPIO_PAR_LCDCTL_PAR_CONTRAST (0x0010) +#define MCF_GPIO_PAR_LCDCTL_PAR_LSCLK (0x0020) +#define MCF_GPIO_PAR_LCDCTL_PAR_LP_HSYNC (0x0040) +#define MCF_GPIO_PAR_LCDCTL_PAR_FLM_VSYNC (0x0080) +#define MCF_GPIO_PAR_LCDCTL_PAR_ACD_OE (0x0100) + +/* Bit definitions and macros for MCF_GPIO_PAR_IRQ */ +#define MCF_GPIO_PAR_IRQ_PAR_IRQ1(x) (((x)&0x0003)<<4) +#define MCF_GPIO_PAR_IRQ_PAR_IRQ2(x) (((x)&0x0003)<<6) +#define MCF_GPIO_PAR_IRQ_PAR_IRQ4(x) (((x)&0x0003)<<8) +#define MCF_GPIO_PAR_IRQ_PAR_IRQ5(x) (((x)&0x0003)<<10) +#define MCF_GPIO_PAR_IRQ_PAR_IRQ6(x) (((x)&0x0003)<<12) + +/* Bit definitions and macros for MCF_GPIO_MSCR_FLEXBUS */ +#define MCF_GPIO_MSCR_FLEXBUS_MSCR_ADDRCTL(x) (((x)&0x03)<<0) +#define MCF_GPIO_MSCR_FLEXBUS_MSCR_DLOWER(x) (((x)&0x03)<<2) +#define MCF_GPIO_MSCR_FLEXBUS_MSCR_DUPPER(x) (((x)&0x03)<<4) + +/* Bit definitions and macros for MCF_GPIO_MSCR_SDRAM */ +#define MCF_GPIO_MSCR_SDRAM_MSCR_SDRAM(x) (((x)&0x03)<<0) +#define MCF_GPIO_MSCR_SDRAM_MSCR_SDCLK(x) (((x)&0x03)<<2) +#define MCF_GPIO_MSCR_SDRAM_MSCR_SDCLKB(x) (((x)&0x03)<<4) + +/* Bit definitions and macros for MCF_GPIO_DSCR_I2C */ +#define MCF_GPIO_DSCR_I2C_I2C_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_PWM */ +#define MCF_GPIO_DSCR_PWM_PWM_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_FEC */ +#define MCF_GPIO_DSCR_FEC_FEC_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_UART */ +#define MCF_GPIO_DSCR_UART_UART0_DSE(x) (((x)&0x03)<<0) +#define MCF_GPIO_DSCR_UART_UART1_DSE(x) (((x)&0x03)<<2) + +/* Bit definitions and macros for MCF_GPIO_DSCR_QSPI */ +#define MCF_GPIO_DSCR_QSPI_QSPI_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_TIMER */ +#define MCF_GPIO_DSCR_TIMER_TIMER_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_SSI */ +#define MCF_GPIO_DSCR_SSI_SSI_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_LCD */ +#define MCF_GPIO_DSCR_LCD_LCD_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_DEBUG */ +#define MCF_GPIO_DSCR_DEBUG_DEBUG_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_CLKRST */ +#define MCF_GPIO_DSCR_CLKRST_CLKRST_DSE(x) (((x)&0x03)<<0) + +/* Bit definitions and macros for MCF_GPIO_DSCR_IRQ */ +#define MCF_GPIO_DSCR_IRQ_IRQ_DSE(x) (((x)&0x03)<<0) + +/* + * Generic GPIO support + */ +#define MCFGPIO_PODR MCFGPIO_PODR_FECH +#define MCFGPIO_PDDR MCFGPIO_PDDR_FECH +#define MCFGPIO_PPDR MCFGPIO_PPDSDR_FECH +#define MCFGPIO_SETR MCFGPIO_PPDSDR_FECH +#define MCFGPIO_CLRR MCFGPIO_PCLRR_FECH + +#define MCFGPIO_PIN_MAX 136 +#define MCFGPIO_IRQ_MAX 8 +#define MCFGPIO_IRQ_VECBASE MCFINT_VECBASE + +/********************************************************************* + * + * Phase Locked Loop (PLL) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCF_PLL_PODR 0xFC0C0000 +#define MCF_PLL_PLLCR 0xFC0C0004 +#define MCF_PLL_PMDR 0xFC0C0008 +#define MCF_PLL_PFDR 0xFC0C000C + +/* Bit definitions and macros for MCF_PLL_PODR */ +#define MCF_PLL_PODR_BUSDIV(x) (((x)&0x0F)<<0) +#define MCF_PLL_PODR_CPUDIV(x) (((x)&0x0F)<<4) + +/* Bit definitions and macros for MCF_PLL_PLLCR */ +#define MCF_PLL_PLLCR_DITHDEV(x) (((x)&0x07)<<0) +#define MCF_PLL_PLLCR_DITHEN (0x80) + +/* Bit definitions and macros for MCF_PLL_PMDR */ +#define MCF_PLL_PMDR_MODDIV(x) (((x)&0xFF)<<0) + +/* Bit definitions and macros for MCF_PLL_PFDR */ +#define MCF_PLL_PFDR_MFD(x) (((x)&0xFF)<<0) + +/********************************************************************* + * + * System Control Module Registers (SCM) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCF_SCM_MPR 0xFC000000 +#define MCF_SCM_PACRA 0xFC000020 +#define MCF_SCM_PACRB 0xFC000024 +#define MCF_SCM_PACRC 0xFC000028 +#define MCF_SCM_PACRD 0xFC00002C +#define MCF_SCM_PACRE 0xFC000040 +#define MCF_SCM_PACRF 0xFC000044 + +#define MCF_SCM_BCR 0xFC040024 + +/********************************************************************* + * + * SDRAM Controller (SDRAMC) + * + *********************************************************************/ + +/* Register read/write macros */ +#define MCF_SDRAMC_SDMR 0xFC0B8000 +#define MCF_SDRAMC_SDCR 0xFC0B8004 +#define MCF_SDRAMC_SDCFG1 0xFC0B8008 +#define MCF_SDRAMC_SDCFG2 0xFC0B800C +#define MCF_SDRAMC_LIMP_FIX 0xFC0B8080 +#define MCF_SDRAMC_SDDS 0xFC0B8100 +#define MCF_SDRAMC_SDCS0 0xFC0B8110 +#define MCF_SDRAMC_SDCS1 0xFC0B8114 +#define MCF_SDRAMC_SDCS2 0xFC0B8118 +#define MCF_SDRAMC_SDCS3 0xFC0B811C + +/* Bit definitions and macros for MCF_SDRAMC_SDMR */ +#define MCF_SDRAMC_SDMR_CMD (0x00010000) +#define MCF_SDRAMC_SDMR_AD(x) (((x)&0x00000FFF)<<18) +#define MCF_SDRAMC_SDMR_BNKAD(x) (((x)&0x00000003)<<30) +#define MCF_SDRAMC_SDMR_BNKAD_LMR (0x00000000) +#define MCF_SDRAMC_SDMR_BNKAD_LEMR (0x40000000) + +/* Bit definitions and macros for MCF_SDRAMC_SDCR */ +#define MCF_SDRAMC_SDCR_IPALL (0x00000002) +#define MCF_SDRAMC_SDCR_IREF (0x00000004) +#define MCF_SDRAMC_SDCR_DQS_OE(x) (((x)&0x0000000F)<<8) +#define MCF_SDRAMC_SDCR_PS(x) (((x)&0x00000003)<<12) +#define MCF_SDRAMC_SDCR_RCNT(x) (((x)&0x0000003F)<<16) +#define MCF_SDRAMC_SDCR_OE_RULE (0x00400000) +#define MCF_SDRAMC_SDCR_MUX(x) (((x)&0x00000003)<<24) +#define MCF_SDRAMC_SDCR_REF (0x10000000) +#define MCF_SDRAMC_SDCR_DDR (0x20000000) +#define MCF_SDRAMC_SDCR_CKE (0x40000000) +#define MCF_SDRAMC_SDCR_MODE_EN (0x80000000) +#define MCF_SDRAMC_SDCR_PS_16 (0x00002000) +#define MCF_SDRAMC_SDCR_PS_32 (0x00000000) + +/* Bit definitions and macros for MCF_SDRAMC_SDCFG1 */ +#define MCF_SDRAMC_SDCFG1_WTLAT(x) (((x)&0x00000007)<<4) +#define MCF_SDRAMC_SDCFG1_REF2ACT(x) (((x)&0x0000000F)<<8) +#define MCF_SDRAMC_SDCFG1_PRE2ACT(x) (((x)&0x00000007)<<12) +#define MCF_SDRAMC_SDCFG1_ACT2RW(x) (((x)&0x00000007)<<16) +#define MCF_SDRAMC_SDCFG1_RDLAT(x) (((x)&0x0000000F)<<20) +#define MCF_SDRAMC_SDCFG1_SWT2RD(x) (((x)&0x00000007)<<24) +#define MCF_SDRAMC_SDCFG1_SRD2RW(x) (((x)&0x0000000F)<<28) + +/* Bit definitions and macros for MCF_SDRAMC_SDCFG2 */ +#define MCF_SDRAMC_SDCFG2_BL(x) (((x)&0x0000000F)<<16) +#define MCF_SDRAMC_SDCFG2_BRD2WT(x) (((x)&0x0000000F)<<20) +#define MCF_SDRAMC_SDCFG2_BWT2RW(x) (((x)&0x0000000F)<<24) +#define MCF_SDRAMC_SDCFG2_BRD2PRE(x) (((x)&0x0000000F)<<28) + +/* Device Errata - LIMP mode work around */ +#define MCF_SDRAMC_REFRESH (0x40000000) + +/* Bit definitions and macros for MCF_SDRAMC_SDDS */ +#define MCF_SDRAMC_SDDS_SB_D(x) (((x)&0x00000003)<<0) +#define MCF_SDRAMC_SDDS_SB_S(x) (((x)&0x00000003)<<2) +#define MCF_SDRAMC_SDDS_SB_A(x) (((x)&0x00000003)<<4) +#define MCF_SDRAMC_SDDS_SB_C(x) (((x)&0x00000003)<<6) +#define MCF_SDRAMC_SDDS_SB_E(x) (((x)&0x00000003)<<8) + +/* Bit definitions and macros for MCF_SDRAMC_SDCS */ +#define MCF_SDRAMC_SDCS_CSSZ(x) (((x)&0x0000001F)<<0) +#define MCF_SDRAMC_SDCS_BASE(x) (((x)&0x00000FFF)<<20) +#define MCF_SDRAMC_SDCS_BA(x) ((x)&0xFFF00000) +#define MCF_SDRAMC_SDCS_CSSZ_DIABLE (0x00000000) +#define MCF_SDRAMC_SDCS_CSSZ_1MBYTE (0x00000013) +#define MCF_SDRAMC_SDCS_CSSZ_2MBYTE (0x00000014) +#define MCF_SDRAMC_SDCS_CSSZ_4MBYTE (0x00000015) +#define MCF_SDRAMC_SDCS_CSSZ_8MBYTE (0x00000016) +#define MCF_SDRAMC_SDCS_CSSZ_16MBYTE (0x00000017) +#define MCF_SDRAMC_SDCS_CSSZ_32MBYTE (0x00000018) +#define MCF_SDRAMC_SDCS_CSSZ_64MBYTE (0x00000019) +#define MCF_SDRAMC_SDCS_CSSZ_128MBYTE (0x0000001A) +#define MCF_SDRAMC_SDCS_CSSZ_256MBYTE (0x0000001B) +#define MCF_SDRAMC_SDCS_CSSZ_512MBYTE (0x0000001C) +#define MCF_SDRAMC_SDCS_CSSZ_1GBYTE (0x0000001D) +#define MCF_SDRAMC_SDCS_CSSZ_2GBYTE (0x0000001E) +#define MCF_SDRAMC_SDCS_CSSZ_4GBYTE (0x0000001F) + +/* + * Edge Port Module (EPORT) + */ +#define MCFEPORT_EPPAR (0xFC094000) +#define MCFEPORT_EPDDR (0xFC094002) +#define MCFEPORT_EPIER (0xFC094003) +#define MCFEPORT_EPDR (0xFC094004) +#define MCFEPORT_EPPDR (0xFC094005) +#define MCFEPORT_EPFR (0xFC094006) + +/********************************************************************/ +#endif /* m53xxsim_h */ diff --git a/arch/m68k/include/asm/m54xxacr.h b/arch/m68k/include/asm/m54xxacr.h index 192bbfeabf70..6d13cae44af5 100644 --- a/arch/m68k/include/asm/m54xxacr.h +++ b/arch/m68k/include/asm/m54xxacr.h @@ -96,8 +96,13 @@ */ #define ACR0_MODE (ACR_BA(CONFIG_MBAR)+ACR_ADMSK(0x1000000)+ \ ACR_ENABLE+ACR_SUPER+ACR_CM_OFF_PRE+ACR_SP) +#if defined(CONFIG_CACHE_COPYBACK) #define ACR1_MODE (ACR_BA(CONFIG_RAMBASE)+ACR_ADMSK(CONFIG_RAMSIZE)+ \ - ACR_ENABLE+ACR_SUPER+ACR_SP) + ACR_ENABLE+ACR_SUPER+ACR_SP+ACR_CM_CP) +#else +#define ACR1_MODE (ACR_BA(CONFIG_RAMBASE)+ACR_ADMSK(CONFIG_RAMSIZE)+ \ + ACR_ENABLE+ACR_SUPER+ACR_SP+ACR_CM_WT) +#endif #define ACR2_MODE 0 #define ACR3_MODE (ACR_BA(CONFIG_RAMBASE)+ACR_ADMSK(CONFIG_RAMSIZE)+ \ ACR_ENABLE+ACR_SUPER+ACR_SP) diff --git a/arch/m68k/include/asm/mcfgpio.h b/arch/m68k/include/asm/mcfgpio.h index fa1059f50dfc..c41ebf45f1d0 100644 --- a/arch/m68k/include/asm/mcfgpio.h +++ b/arch/m68k/include/asm/mcfgpio.h @@ -104,7 +104,7 @@ static inline void gpio_free(unsigned gpio) #if defined(CONFIG_M5206) || defined(CONFIG_M5206e) || \ defined(CONFIG_M520x) || defined(CONFIG_M523x) || \ defined(CONFIG_M527x) || defined(CONFIG_M528x) || \ - defined(CONFIG_M532x) || defined(CONFIG_M54xx) || \ + defined(CONFIG_M53xx) || defined(CONFIG_M54xx) || \ defined(CONFIG_M5441x) /* These parts have GPIO organized by 8 bit ports */ @@ -139,7 +139,7 @@ static inline void gpio_free(unsigned gpio) #if defined(CONFIG_M520x) || defined(CONFIG_M523x) || \ defined(CONFIG_M527x) || defined(CONFIG_M528x) || \ - defined(CONFIG_M532x) || defined(CONFIG_M5441x) + defined(CONFIG_M53xx) || defined(CONFIG_M5441x) /* * These parts have an 'Edge' Port module (external interrupt/GPIO) which uses * read-modify-write to change an output and a GPIO module which has separate @@ -195,7 +195,7 @@ static inline u32 __mcfgpio_ppdr(unsigned gpio) return MCFSIM2_GPIO1READ; #elif defined(CONFIG_M520x) || defined(CONFIG_M523x) || \ defined(CONFIG_M527x) || defined(CONFIG_M528x) || \ - defined(CONFIG_M532x) || defined(CONFIG_M5441x) + defined(CONFIG_M53xx) || defined(CONFIG_M5441x) #if !defined(CONFIG_M5441x) if (gpio < 8) return MCFEPORT_EPPDR; @@ -237,7 +237,7 @@ static inline u32 __mcfgpio_podr(unsigned gpio) return MCFSIM2_GPIO1WRITE; #elif defined(CONFIG_M520x) || defined(CONFIG_M523x) || \ defined(CONFIG_M527x) || defined(CONFIG_M528x) || \ - defined(CONFIG_M532x) || defined(CONFIG_M5441x) + defined(CONFIG_M53xx) || defined(CONFIG_M5441x) #if !defined(CONFIG_M5441x) if (gpio < 8) return MCFEPORT_EPDR; @@ -279,7 +279,7 @@ static inline u32 __mcfgpio_pddr(unsigned gpio) return MCFSIM2_GPIO1ENABLE; #elif defined(CONFIG_M520x) || defined(CONFIG_M523x) || \ defined(CONFIG_M527x) || defined(CONFIG_M528x) || \ - defined(CONFIG_M532x) || defined(CONFIG_M5441x) + defined(CONFIG_M53xx) || defined(CONFIG_M5441x) #if !defined(CONFIG_M5441x) if (gpio < 8) return MCFEPORT_EPDDR; diff --git a/arch/m68k/include/asm/mcfsim.h b/arch/m68k/include/asm/mcfsim.h index a04fd9b2714c..bc867de8a1e9 100644 --- a/arch/m68k/include/asm/mcfsim.h +++ b/arch/m68k/include/asm/mcfsim.h @@ -36,8 +36,8 @@ #elif defined(CONFIG_M5307) #include #include -#elif defined(CONFIG_M532x) -#include +#elif defined(CONFIG_M53xx) +#include #elif defined(CONFIG_M5407) #include #include diff --git a/arch/m68k/include/asm/mcftimer.h b/arch/m68k/include/asm/mcftimer.h index da2fa43c2e45..089f0f150bbf 100644 --- a/arch/m68k/include/asm/mcftimer.h +++ b/arch/m68k/include/asm/mcftimer.h @@ -19,7 +19,7 @@ #define MCFTIMER_TRR 0x04 /* Timer Reference (r/w) */ #define MCFTIMER_TCR 0x08 /* Timer Capture reg (r/w) */ #define MCFTIMER_TCN 0x0C /* Timer Counter reg (r/w) */ -#if defined(CONFIG_M532x) || defined(CONFIG_M5441x) +#if defined(CONFIG_M53xx) || defined(CONFIG_M5441x) #define MCFTIMER_TER 0x03 /* Timer Event reg (r/w) */ #else #define MCFTIMER_TER 0x11 /* Timer Event reg (r/w) */ diff --git a/arch/m68k/platform/coldfire/Makefile b/arch/m68k/platform/coldfire/Makefile index 02591a109f8c..68f0fac60099 100644 --- a/arch/m68k/platform/coldfire/Makefile +++ b/arch/m68k/platform/coldfire/Makefile @@ -25,7 +25,7 @@ obj-$(CONFIG_M527x) += m527x.o pit.o intc-2.o reset.o obj-$(CONFIG_M5272) += m5272.o intc-5272.o timers.o obj-$(CONFIG_M528x) += m528x.o pit.o intc-2.o reset.o obj-$(CONFIG_M5307) += m5307.o timers.o intc.o reset.o -obj-$(CONFIG_M532x) += m532x.o timers.o intc-simr.o reset.o +obj-$(CONFIG_M53xx) += m53xx.o timers.o intc-simr.o reset.o obj-$(CONFIG_M5407) += m5407.o timers.o intc.o reset.o obj-$(CONFIG_M54xx) += m54xx.o sltimers.o intc-2.o obj-$(CONFIG_M5441x) += m5441x.o pit.o intc-simr.o reset.o diff --git a/arch/m68k/platform/coldfire/m532x.c b/arch/m68k/platform/coldfire/m532x.c deleted file mode 100644 index 7951d1d43357..000000000000 --- a/arch/m68k/platform/coldfire/m532x.c +++ /dev/null @@ -1,593 +0,0 @@ -/***************************************************************************/ - -/* - * linux/arch/m68knommu/platform/532x/config.c - * - * Copyright (C) 1999-2002, Greg Ungerer (gerg@snapgear.com) - * Copyright (C) 2000, Lineo (www.lineo.com) - * Yaroslav Vinogradov yaroslav.vinogradov@freescale.com - * Copyright Freescale Semiconductor, Inc 2006 - * Copyright (c) 2006, emlix, Sebastian Hess - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -/***************************************************************************/ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/***************************************************************************/ - -DEFINE_CLK(0, "flexbus", 2, MCF_CLK); -DEFINE_CLK(0, "mcfcan.0", 8, MCF_CLK); -DEFINE_CLK(0, "fec.0", 12, MCF_CLK); -DEFINE_CLK(0, "edma", 17, MCF_CLK); -DEFINE_CLK(0, "intc.0", 18, MCF_CLK); -DEFINE_CLK(0, "intc.1", 19, MCF_CLK); -DEFINE_CLK(0, "iack.0", 21, MCF_CLK); -DEFINE_CLK(0, "mcfi2c.0", 22, MCF_CLK); -DEFINE_CLK(0, "mcfqspi.0", 23, MCF_CLK); -DEFINE_CLK(0, "mcfuart.0", 24, MCF_BUSCLK); -DEFINE_CLK(0, "mcfuart.1", 25, MCF_BUSCLK); -DEFINE_CLK(0, "mcfuart.2", 26, MCF_BUSCLK); -DEFINE_CLK(0, "mcftmr.0", 28, MCF_CLK); -DEFINE_CLK(0, "mcftmr.1", 29, MCF_CLK); -DEFINE_CLK(0, "mcftmr.2", 30, MCF_CLK); -DEFINE_CLK(0, "mcftmr.3", 31, MCF_CLK); - -DEFINE_CLK(0, "mcfpit.0", 32, MCF_CLK); -DEFINE_CLK(0, "mcfpit.1", 33, MCF_CLK); -DEFINE_CLK(0, "mcfpit.2", 34, MCF_CLK); -DEFINE_CLK(0, "mcfpit.3", 35, MCF_CLK); -DEFINE_CLK(0, "mcfpwm.0", 36, MCF_CLK); -DEFINE_CLK(0, "mcfeport.0", 37, MCF_CLK); -DEFINE_CLK(0, "mcfwdt.0", 38, MCF_CLK); -DEFINE_CLK(0, "sys.0", 40, MCF_BUSCLK); -DEFINE_CLK(0, "gpio.0", 41, MCF_BUSCLK); -DEFINE_CLK(0, "mcfrtc.0", 42, MCF_CLK); -DEFINE_CLK(0, "mcflcd.0", 43, MCF_CLK); -DEFINE_CLK(0, "mcfusb-otg.0", 44, MCF_CLK); -DEFINE_CLK(0, "mcfusb-host.0", 45, MCF_CLK); -DEFINE_CLK(0, "sdram.0", 46, MCF_CLK); -DEFINE_CLK(0, "ssi.0", 47, MCF_CLK); -DEFINE_CLK(0, "pll.0", 48, MCF_CLK); - -DEFINE_CLK(1, "mdha.0", 32, MCF_CLK); -DEFINE_CLK(1, "skha.0", 33, MCF_CLK); -DEFINE_CLK(1, "rng.0", 34, MCF_CLK); - -struct clk *mcf_clks[] = { - &__clk_0_2, /* flexbus */ - &__clk_0_8, /* mcfcan.0 */ - &__clk_0_12, /* fec.0 */ - &__clk_0_17, /* edma */ - &__clk_0_18, /* intc.0 */ - &__clk_0_19, /* intc.1 */ - &__clk_0_21, /* iack.0 */ - &__clk_0_22, /* mcfi2c.0 */ - &__clk_0_23, /* mcfqspi.0 */ - &__clk_0_24, /* mcfuart.0 */ - &__clk_0_25, /* mcfuart.1 */ - &__clk_0_26, /* mcfuart.2 */ - &__clk_0_28, /* mcftmr.0 */ - &__clk_0_29, /* mcftmr.1 */ - &__clk_0_30, /* mcftmr.2 */ - &__clk_0_31, /* mcftmr.3 */ - - &__clk_0_32, /* mcfpit.0 */ - &__clk_0_33, /* mcfpit.1 */ - &__clk_0_34, /* mcfpit.2 */ - &__clk_0_35, /* mcfpit.3 */ - &__clk_0_36, /* mcfpwm.0 */ - &__clk_0_37, /* mcfeport.0 */ - &__clk_0_38, /* mcfwdt.0 */ - &__clk_0_40, /* sys.0 */ - &__clk_0_41, /* gpio.0 */ - &__clk_0_42, /* mcfrtc.0 */ - &__clk_0_43, /* mcflcd.0 */ - &__clk_0_44, /* mcfusb-otg.0 */ - &__clk_0_45, /* mcfusb-host.0 */ - &__clk_0_46, /* sdram.0 */ - &__clk_0_47, /* ssi.0 */ - &__clk_0_48, /* pll.0 */ - - &__clk_1_32, /* mdha.0 */ - &__clk_1_33, /* skha.0 */ - &__clk_1_34, /* rng.0 */ - NULL, -}; - -static struct clk * const enable_clks[] __initconst = { - &__clk_0_2, /* flexbus */ - &__clk_0_18, /* intc.0 */ - &__clk_0_19, /* intc.1 */ - &__clk_0_21, /* iack.0 */ - &__clk_0_24, /* mcfuart.0 */ - &__clk_0_25, /* mcfuart.1 */ - &__clk_0_26, /* mcfuart.2 */ - - &__clk_0_32, /* mcfpit.0 */ - &__clk_0_33, /* mcfpit.1 */ - &__clk_0_37, /* mcfeport.0 */ - &__clk_0_40, /* sys.0 */ - &__clk_0_41, /* gpio.0 */ - &__clk_0_46, /* sdram.0 */ - &__clk_0_48, /* pll.0 */ -}; - -static struct clk * const disable_clks[] __initconst = { - &__clk_0_8, /* mcfcan.0 */ - &__clk_0_12, /* fec.0 */ - &__clk_0_17, /* edma */ - &__clk_0_22, /* mcfi2c.0 */ - &__clk_0_23, /* mcfqspi.0 */ - &__clk_0_28, /* mcftmr.0 */ - &__clk_0_29, /* mcftmr.1 */ - &__clk_0_30, /* mcftmr.2 */ - &__clk_0_31, /* mcftmr.3 */ - &__clk_0_34, /* mcfpit.2 */ - &__clk_0_35, /* mcfpit.3 */ - &__clk_0_36, /* mcfpwm.0 */ - &__clk_0_38, /* mcfwdt.0 */ - &__clk_0_42, /* mcfrtc.0 */ - &__clk_0_43, /* mcflcd.0 */ - &__clk_0_44, /* mcfusb-otg.0 */ - &__clk_0_45, /* mcfusb-host.0 */ - &__clk_0_47, /* ssi.0 */ - &__clk_1_32, /* mdha.0 */ - &__clk_1_33, /* skha.0 */ - &__clk_1_34, /* rng.0 */ -}; - - -static void __init m532x_clk_init(void) -{ - unsigned i; - - /* make sure these clocks are enabled */ - for (i = 0; i < ARRAY_SIZE(enable_clks); ++i) - __clk_init_enabled(enable_clks[i]); - /* make sure these clocks are disabled */ - for (i = 0; i < ARRAY_SIZE(disable_clks); ++i) - __clk_init_disabled(disable_clks[i]); -} - -/***************************************************************************/ - -#if IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) - -static void __init m532x_qspi_init(void) -{ - /* setup QSPS pins for QSPI with gpio CS control */ - writew(0x01f0, MCFGPIO_PAR_QSPI); -} - -#endif /* IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) */ - -/***************************************************************************/ - -static void __init m532x_uarts_init(void) -{ - /* UART GPIO initialization */ - writew(readw(MCFGPIO_PAR_UART) | 0x0FFF, MCFGPIO_PAR_UART); -} - -/***************************************************************************/ - -static void __init m532x_fec_init(void) -{ - u8 v; - - /* Set multi-function pins to ethernet mode for fec0 */ - v = readb(MCFGPIO_PAR_FECI2C); - v |= MCF_GPIO_PAR_FECI2C_PAR_MDC_EMDC | - MCF_GPIO_PAR_FECI2C_PAR_MDIO_EMDIO; - writeb(v, MCFGPIO_PAR_FECI2C); - - v = readb(MCFGPIO_PAR_FEC); - v = MCF_GPIO_PAR_FEC_PAR_FEC_7W_FEC | MCF_GPIO_PAR_FEC_PAR_FEC_MII_FEC; - writeb(v, MCFGPIO_PAR_FEC); -} - -/***************************************************************************/ - -void __init config_BSP(char *commandp, int size) -{ -#if !defined(CONFIG_BOOTPARAM) - /* Copy command line from FLASH to local buffer... */ - memcpy(commandp, (char *) 0x4000, 4); - if(strncmp(commandp, "kcl ", 4) == 0){ - memcpy(commandp, (char *) 0x4004, size); - commandp[size-1] = 0; - } else { - memset(commandp, 0, size); - } -#endif - mach_sched_init = hw_timer_init; - m532x_clk_init(); - m532x_uarts_init(); - m532x_fec_init(); -#if IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) - m532x_qspi_init(); -#endif - -#ifdef CONFIG_BDM_DISABLE - /* - * Disable the BDM clocking. This also turns off most of the rest of - * the BDM device. This is good for EMC reasons. This option is not - * incompatible with the memory protection option. - */ - wdebug(MCFDEBUG_CSR, MCFDEBUG_CSR_PSTCLK); -#endif -} - -/***************************************************************************/ -/* Board initialization */ -/***************************************************************************/ -/* - * PLL min/max specifications - */ -#define MAX_FVCO 500000 /* KHz */ -#define MAX_FSYS 80000 /* KHz */ -#define MIN_FSYS 58333 /* KHz */ -#define FREF 16000 /* KHz */ - - -#define MAX_MFD 135 /* Multiplier */ -#define MIN_MFD 88 /* Multiplier */ -#define BUSDIV 6 /* Divider */ - -/* - * Low Power Divider specifications - */ -#define MIN_LPD (1 << 0) /* Divider (not encoded) */ -#define MAX_LPD (1 << 15) /* Divider (not encoded) */ -#define DEFAULT_LPD (1 << 1) /* Divider (not encoded) */ - -#define SYS_CLK_KHZ 80000 -#define SYSTEM_PERIOD 12.5 -/* - * SDRAM Timing Parameters - */ -#define SDRAM_BL 8 /* # of beats in a burst */ -#define SDRAM_TWR 2 /* in clocks */ -#define SDRAM_CASL 2.5 /* CASL in clocks */ -#define SDRAM_TRCD 2 /* in clocks */ -#define SDRAM_TRP 2 /* in clocks */ -#define SDRAM_TRFC 7 /* in clocks */ -#define SDRAM_TREFI 7800 /* in ns */ - -#define EXT_SRAM_ADDRESS (0xC0000000) -#define FLASH_ADDRESS (0x00000000) -#define SDRAM_ADDRESS (0x40000000) - -#define NAND_FLASH_ADDRESS (0xD0000000) - -int sys_clk_khz = 0; -int sys_clk_mhz = 0; - -void wtm_init(void); -void scm_init(void); -void gpio_init(void); -void fbcs_init(void); -void sdramc_init(void); -int clock_pll (int fsys, int flags); -int clock_limp (int); -int clock_exit_limp (void); -int get_sys_clock (void); - -asmlinkage void __init sysinit(void) -{ - sys_clk_khz = clock_pll(0, 0); - sys_clk_mhz = sys_clk_khz/1000; - - wtm_init(); - scm_init(); - gpio_init(); - fbcs_init(); - sdramc_init(); -} - -void wtm_init(void) -{ - /* Disable watchdog timer */ - writew(0, MCF_WTM_WCR); -} - -#define MCF_SCM_BCR_GBW (0x00000100) -#define MCF_SCM_BCR_GBR (0x00000200) - -void scm_init(void) -{ - /* All masters are trusted */ - writel(0x77777777, MCF_SCM_MPR); - - /* Allow supervisor/user, read/write, and trusted/untrusted - access to all slaves */ - writel(0, MCF_SCM_PACRA); - writel(0, MCF_SCM_PACRB); - writel(0, MCF_SCM_PACRC); - writel(0, MCF_SCM_PACRD); - writel(0, MCF_SCM_PACRE); - writel(0, MCF_SCM_PACRF); - - /* Enable bursts */ - writel(MCF_SCM_BCR_GBR | MCF_SCM_BCR_GBW, MCF_SCM_BCR); -} - - -void fbcs_init(void) -{ - writeb(0x3E, MCFGPIO_PAR_CS); - - /* Latch chip select */ - writel(0x10080000, MCF_FBCS1_CSAR); - - writel(0x002A3780, MCF_FBCS1_CSCR); - writel(MCF_FBCS_CSMR_BAM_2M | MCF_FBCS_CSMR_V, MCF_FBCS1_CSMR); - - /* Initialize latch to drive signals to inactive states */ - writew(0xffff, 0x10080000); - - /* External SRAM */ - writel(EXT_SRAM_ADDRESS, MCF_FBCS1_CSAR); - writel(MCF_FBCS_CSCR_PS_16 | - MCF_FBCS_CSCR_AA | - MCF_FBCS_CSCR_SBM | - MCF_FBCS_CSCR_WS(1), - MCF_FBCS1_CSCR); - writel(MCF_FBCS_CSMR_BAM_512K | MCF_FBCS_CSMR_V, MCF_FBCS1_CSMR); - - /* Boot Flash connected to FBCS0 */ - writel(FLASH_ADDRESS, MCF_FBCS0_CSAR); - writel(MCF_FBCS_CSCR_PS_16 | - MCF_FBCS_CSCR_BEM | - MCF_FBCS_CSCR_AA | - MCF_FBCS_CSCR_SBM | - MCF_FBCS_CSCR_WS(7), - MCF_FBCS0_CSCR); - writel(MCF_FBCS_CSMR_BAM_32M | MCF_FBCS_CSMR_V, MCF_FBCS0_CSMR); -} - -void sdramc_init(void) -{ - /* - * Check to see if the SDRAM has already been initialized - * by a run control tool - */ - if (!(readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF)) { - /* SDRAM chip select initialization */ - - /* Initialize SDRAM chip select */ - writel(MCF_SDRAMC_SDCS_BA(SDRAM_ADDRESS) | - MCF_SDRAMC_SDCS_CSSZ(MCF_SDRAMC_SDCS_CSSZ_32MBYTE), - MCF_SDRAMC_SDCS0); - - /* - * Basic configuration and initialization - */ - writel(MCF_SDRAMC_SDCFG1_SRD2RW((int)((SDRAM_CASL + 2) + 0.5)) | - MCF_SDRAMC_SDCFG1_SWT2RD(SDRAM_TWR + 1) | - MCF_SDRAMC_SDCFG1_RDLAT((int)((SDRAM_CASL * 2) + 2)) | - MCF_SDRAMC_SDCFG1_ACT2RW((int)(SDRAM_TRCD + 0.5)) | - MCF_SDRAMC_SDCFG1_PRE2ACT((int)(SDRAM_TRP + 0.5)) | - MCF_SDRAMC_SDCFG1_REF2ACT((int)(SDRAM_TRFC + 0.5)) | - MCF_SDRAMC_SDCFG1_WTLAT(3), - MCF_SDRAMC_SDCFG1); - writel(MCF_SDRAMC_SDCFG2_BRD2PRE(SDRAM_BL / 2 + 1) | - MCF_SDRAMC_SDCFG2_BWT2RW(SDRAM_BL / 2 + SDRAM_TWR) | - MCF_SDRAMC_SDCFG2_BRD2WT((int)((SDRAM_CASL + SDRAM_BL / 2 - 1.0) + 0.5)) | - MCF_SDRAMC_SDCFG2_BL(SDRAM_BL - 1), - MCF_SDRAMC_SDCFG2); - - - /* - * Precharge and enable write to SDMR - */ - writel(MCF_SDRAMC_SDCR_MODE_EN | - MCF_SDRAMC_SDCR_CKE | - MCF_SDRAMC_SDCR_DDR | - MCF_SDRAMC_SDCR_MUX(1) | - MCF_SDRAMC_SDCR_RCNT((int)(((SDRAM_TREFI / (SYSTEM_PERIOD * 64)) - 1) + 0.5)) | - MCF_SDRAMC_SDCR_PS_16 | - MCF_SDRAMC_SDCR_IPALL, - MCF_SDRAMC_SDCR); - - /* - * Write extended mode register - */ - writel(MCF_SDRAMC_SDMR_BNKAD_LEMR | - MCF_SDRAMC_SDMR_AD(0x0) | - MCF_SDRAMC_SDMR_CMD, - MCF_SDRAMC_SDMR); - - /* - * Write mode register and reset DLL - */ - writel(MCF_SDRAMC_SDMR_BNKAD_LMR | - MCF_SDRAMC_SDMR_AD(0x163) | - MCF_SDRAMC_SDMR_CMD, - MCF_SDRAMC_SDMR); - - /* - * Execute a PALL command - */ - writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IPALL, MCF_SDRAMC_SDCR); - - /* - * Perform two REF cycles - */ - writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IREF, MCF_SDRAMC_SDCR); - writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IREF, MCF_SDRAMC_SDCR); - - /* - * Write mode register and clear reset DLL - */ - writel(MCF_SDRAMC_SDMR_BNKAD_LMR | - MCF_SDRAMC_SDMR_AD(0x063) | - MCF_SDRAMC_SDMR_CMD, - MCF_SDRAMC_SDMR); - - /* - * Enable auto refresh and lock SDMR - */ - writel(readl(MCF_SDRAMC_SDCR) & ~MCF_SDRAMC_SDCR_MODE_EN, - MCF_SDRAMC_SDCR); - writel(MCF_SDRAMC_SDCR_REF | MCF_SDRAMC_SDCR_DQS_OE(0xC), - MCF_SDRAMC_SDCR); - } -} - -void gpio_init(void) -{ - /* Enable UART0 pins */ - writew(MCF_GPIO_PAR_UART_PAR_URXD0 | MCF_GPIO_PAR_UART_PAR_UTXD0, - MCFGPIO_PAR_UART); - - /* - * Initialize TIN3 as a GPIO output to enable the write - * half of the latch. - */ - writeb(0x00, MCFGPIO_PAR_TIMER); - writeb(0x08, MCFGPIO_PDDR_TIMER); - writeb(0x00, MCFGPIO_PCLRR_TIMER); -} - -int clock_pll(int fsys, int flags) -{ - int fref, temp, fout, mfd; - u32 i; - - fref = FREF; - - if (fsys == 0) { - /* Return current PLL output */ - mfd = readb(MCF_PLL_PFDR); - - return (fref * mfd / (BUSDIV * 4)); - } - - /* Check bounds of requested system clock */ - if (fsys > MAX_FSYS) - fsys = MAX_FSYS; - if (fsys < MIN_FSYS) - fsys = MIN_FSYS; - - /* Multiplying by 100 when calculating the temp value, - and then dividing by 100 to calculate the mfd allows - for exact values without needing to include floating - point libraries. */ - temp = 100 * fsys / fref; - mfd = 4 * BUSDIV * temp / 100; - - /* Determine the output frequency for selected values */ - fout = (fref * mfd / (BUSDIV * 4)); - - /* - * Check to see if the SDRAM has already been initialized. - * If it has then the SDRAM needs to be put into self refresh - * mode before reprogramming the PLL. - */ - if (readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF) - /* Put SDRAM into self refresh mode */ - writel(readl(MCF_SDRAMC_SDCR) & ~MCF_SDRAMC_SDCR_CKE, - MCF_SDRAMC_SDCR); - - /* - * Initialize the PLL to generate the new system clock frequency. - * The device must be put into LIMP mode to reprogram the PLL. - */ - - /* Enter LIMP mode */ - clock_limp(DEFAULT_LPD); - - /* Reprogram PLL for desired fsys */ - writeb(MCF_PLL_PODR_CPUDIV(BUSDIV/3) | MCF_PLL_PODR_BUSDIV(BUSDIV), - MCF_PLL_PODR); - - writeb(mfd, MCF_PLL_PFDR); - - /* Exit LIMP mode */ - clock_exit_limp(); - - /* - * Return the SDRAM to normal operation if it is in use. - */ - if (readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF) - /* Exit self refresh mode */ - writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_CKE, - MCF_SDRAMC_SDCR); - - /* Errata - workaround for SDRAM opeartion after exiting LIMP mode */ - writel(MCF_SDRAMC_REFRESH, MCF_SDRAMC_LIMP_FIX); - - /* wait for DQS logic to relock */ - for (i = 0; i < 0x200; i++) - ; - - return fout; -} - -int clock_limp(int div) -{ - u32 temp; - - /* Check bounds of divider */ - if (div < MIN_LPD) - div = MIN_LPD; - if (div > MAX_LPD) - div = MAX_LPD; - - /* Save of the current value of the SSIDIV so we don't - overwrite the value*/ - temp = readw(MCF_CCM_CDR) & MCF_CCM_CDR_SSIDIV(0xF); - - /* Apply the divider to the system clock */ - writew(MCF_CCM_CDR_LPDIV(div) | MCF_CCM_CDR_SSIDIV(temp), MCF_CCM_CDR); - - writew(readw(MCF_CCM_MISCCR) | MCF_CCM_MISCCR_LIMP, MCF_CCM_MISCCR); - - return (FREF/(3*(1 << div))); -} - -int clock_exit_limp(void) -{ - int fout; - - /* Exit LIMP mode */ - writew(readw(MCF_CCM_MISCCR) & ~MCF_CCM_MISCCR_LIMP, MCF_CCM_MISCCR); - - /* Wait for PLL to lock */ - while (!(readw(MCF_CCM_MISCCR) & MCF_CCM_MISCCR_PLL_LOCK)) - ; - - fout = get_sys_clock(); - - return fout; -} - -int get_sys_clock(void) -{ - int divider; - - /* Test to see if device is in LIMP mode */ - if (readw(MCF_CCM_MISCCR) & MCF_CCM_MISCCR_LIMP) { - divider = readw(MCF_CCM_CDR) & MCF_CCM_CDR_LPDIV(0xF); - return (FREF/(2 << divider)); - } - else - return (FREF * readb(MCF_PLL_PFDR)) / (BUSDIV * 4); -} diff --git a/arch/m68k/platform/coldfire/m53xx.c b/arch/m68k/platform/coldfire/m53xx.c new file mode 100644 index 000000000000..5286f98fbed0 --- /dev/null +++ b/arch/m68k/platform/coldfire/m53xx.c @@ -0,0 +1,592 @@ +/***************************************************************************/ + +/* + * m53xx.c -- platform support for ColdFire 53xx based boards + * + * Copyright (C) 1999-2002, Greg Ungerer (gerg@snapgear.com) + * Copyright (C) 2000, Lineo (www.lineo.com) + * Yaroslav Vinogradov yaroslav.vinogradov@freescale.com + * Copyright Freescale Semiconductor, Inc 2006 + * Copyright (c) 2006, emlix, Sebastian Hess + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +/***************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/***************************************************************************/ + +DEFINE_CLK(0, "flexbus", 2, MCF_CLK); +DEFINE_CLK(0, "mcfcan.0", 8, MCF_CLK); +DEFINE_CLK(0, "fec.0", 12, MCF_CLK); +DEFINE_CLK(0, "edma", 17, MCF_CLK); +DEFINE_CLK(0, "intc.0", 18, MCF_CLK); +DEFINE_CLK(0, "intc.1", 19, MCF_CLK); +DEFINE_CLK(0, "iack.0", 21, MCF_CLK); +DEFINE_CLK(0, "mcfi2c.0", 22, MCF_CLK); +DEFINE_CLK(0, "mcfqspi.0", 23, MCF_CLK); +DEFINE_CLK(0, "mcfuart.0", 24, MCF_BUSCLK); +DEFINE_CLK(0, "mcfuart.1", 25, MCF_BUSCLK); +DEFINE_CLK(0, "mcfuart.2", 26, MCF_BUSCLK); +DEFINE_CLK(0, "mcftmr.0", 28, MCF_CLK); +DEFINE_CLK(0, "mcftmr.1", 29, MCF_CLK); +DEFINE_CLK(0, "mcftmr.2", 30, MCF_CLK); +DEFINE_CLK(0, "mcftmr.3", 31, MCF_CLK); + +DEFINE_CLK(0, "mcfpit.0", 32, MCF_CLK); +DEFINE_CLK(0, "mcfpit.1", 33, MCF_CLK); +DEFINE_CLK(0, "mcfpit.2", 34, MCF_CLK); +DEFINE_CLK(0, "mcfpit.3", 35, MCF_CLK); +DEFINE_CLK(0, "mcfpwm.0", 36, MCF_CLK); +DEFINE_CLK(0, "mcfeport.0", 37, MCF_CLK); +DEFINE_CLK(0, "mcfwdt.0", 38, MCF_CLK); +DEFINE_CLK(0, "sys.0", 40, MCF_BUSCLK); +DEFINE_CLK(0, "gpio.0", 41, MCF_BUSCLK); +DEFINE_CLK(0, "mcfrtc.0", 42, MCF_CLK); +DEFINE_CLK(0, "mcflcd.0", 43, MCF_CLK); +DEFINE_CLK(0, "mcfusb-otg.0", 44, MCF_CLK); +DEFINE_CLK(0, "mcfusb-host.0", 45, MCF_CLK); +DEFINE_CLK(0, "sdram.0", 46, MCF_CLK); +DEFINE_CLK(0, "ssi.0", 47, MCF_CLK); +DEFINE_CLK(0, "pll.0", 48, MCF_CLK); + +DEFINE_CLK(1, "mdha.0", 32, MCF_CLK); +DEFINE_CLK(1, "skha.0", 33, MCF_CLK); +DEFINE_CLK(1, "rng.0", 34, MCF_CLK); + +struct clk *mcf_clks[] = { + &__clk_0_2, /* flexbus */ + &__clk_0_8, /* mcfcan.0 */ + &__clk_0_12, /* fec.0 */ + &__clk_0_17, /* edma */ + &__clk_0_18, /* intc.0 */ + &__clk_0_19, /* intc.1 */ + &__clk_0_21, /* iack.0 */ + &__clk_0_22, /* mcfi2c.0 */ + &__clk_0_23, /* mcfqspi.0 */ + &__clk_0_24, /* mcfuart.0 */ + &__clk_0_25, /* mcfuart.1 */ + &__clk_0_26, /* mcfuart.2 */ + &__clk_0_28, /* mcftmr.0 */ + &__clk_0_29, /* mcftmr.1 */ + &__clk_0_30, /* mcftmr.2 */ + &__clk_0_31, /* mcftmr.3 */ + + &__clk_0_32, /* mcfpit.0 */ + &__clk_0_33, /* mcfpit.1 */ + &__clk_0_34, /* mcfpit.2 */ + &__clk_0_35, /* mcfpit.3 */ + &__clk_0_36, /* mcfpwm.0 */ + &__clk_0_37, /* mcfeport.0 */ + &__clk_0_38, /* mcfwdt.0 */ + &__clk_0_40, /* sys.0 */ + &__clk_0_41, /* gpio.0 */ + &__clk_0_42, /* mcfrtc.0 */ + &__clk_0_43, /* mcflcd.0 */ + &__clk_0_44, /* mcfusb-otg.0 */ + &__clk_0_45, /* mcfusb-host.0 */ + &__clk_0_46, /* sdram.0 */ + &__clk_0_47, /* ssi.0 */ + &__clk_0_48, /* pll.0 */ + + &__clk_1_32, /* mdha.0 */ + &__clk_1_33, /* skha.0 */ + &__clk_1_34, /* rng.0 */ + NULL, +}; + +static struct clk * const enable_clks[] __initconst = { + &__clk_0_2, /* flexbus */ + &__clk_0_18, /* intc.0 */ + &__clk_0_19, /* intc.1 */ + &__clk_0_21, /* iack.0 */ + &__clk_0_24, /* mcfuart.0 */ + &__clk_0_25, /* mcfuart.1 */ + &__clk_0_26, /* mcfuart.2 */ + &__clk_0_28, /* mcftmr.0 */ + &__clk_0_29, /* mcftmr.1 */ + &__clk_0_32, /* mcfpit.0 */ + &__clk_0_33, /* mcfpit.1 */ + &__clk_0_37, /* mcfeport.0 */ + &__clk_0_40, /* sys.0 */ + &__clk_0_41, /* gpio.0 */ + &__clk_0_46, /* sdram.0 */ + &__clk_0_48, /* pll.0 */ +}; + +static struct clk * const disable_clks[] __initconst = { + &__clk_0_8, /* mcfcan.0 */ + &__clk_0_12, /* fec.0 */ + &__clk_0_17, /* edma */ + &__clk_0_22, /* mcfi2c.0 */ + &__clk_0_23, /* mcfqspi.0 */ + &__clk_0_30, /* mcftmr.2 */ + &__clk_0_31, /* mcftmr.3 */ + &__clk_0_34, /* mcfpit.2 */ + &__clk_0_35, /* mcfpit.3 */ + &__clk_0_36, /* mcfpwm.0 */ + &__clk_0_38, /* mcfwdt.0 */ + &__clk_0_42, /* mcfrtc.0 */ + &__clk_0_43, /* mcflcd.0 */ + &__clk_0_44, /* mcfusb-otg.0 */ + &__clk_0_45, /* mcfusb-host.0 */ + &__clk_0_47, /* ssi.0 */ + &__clk_1_32, /* mdha.0 */ + &__clk_1_33, /* skha.0 */ + &__clk_1_34, /* rng.0 */ +}; + + +static void __init m53xx_clk_init(void) +{ + unsigned i; + + /* make sure these clocks are enabled */ + for (i = 0; i < ARRAY_SIZE(enable_clks); ++i) + __clk_init_enabled(enable_clks[i]); + /* make sure these clocks are disabled */ + for (i = 0; i < ARRAY_SIZE(disable_clks); ++i) + __clk_init_disabled(disable_clks[i]); +} + +/***************************************************************************/ + +#if IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) + +static void __init m53xx_qspi_init(void) +{ + /* setup QSPS pins for QSPI with gpio CS control */ + writew(0x01f0, MCFGPIO_PAR_QSPI); +} + +#endif /* IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) */ + +/***************************************************************************/ + +static void __init m53xx_uarts_init(void) +{ + /* UART GPIO initialization */ + writew(readw(MCFGPIO_PAR_UART) | 0x0FFF, MCFGPIO_PAR_UART); +} + +/***************************************************************************/ + +static void __init m53xx_fec_init(void) +{ + u8 v; + + /* Set multi-function pins to ethernet mode for fec0 */ + v = readb(MCFGPIO_PAR_FECI2C); + v |= MCF_GPIO_PAR_FECI2C_PAR_MDC_EMDC | + MCF_GPIO_PAR_FECI2C_PAR_MDIO_EMDIO; + writeb(v, MCFGPIO_PAR_FECI2C); + + v = readb(MCFGPIO_PAR_FEC); + v = MCF_GPIO_PAR_FEC_PAR_FEC_7W_FEC | MCF_GPIO_PAR_FEC_PAR_FEC_MII_FEC; + writeb(v, MCFGPIO_PAR_FEC); +} + +/***************************************************************************/ + +void __init config_BSP(char *commandp, int size) +{ +#if !defined(CONFIG_BOOTPARAM) + /* Copy command line from FLASH to local buffer... */ + memcpy(commandp, (char *) 0x4000, 4); + if(strncmp(commandp, "kcl ", 4) == 0){ + memcpy(commandp, (char *) 0x4004, size); + commandp[size-1] = 0; + } else { + memset(commandp, 0, size); + } +#endif + mach_sched_init = hw_timer_init; + m53xx_clk_init(); + m53xx_uarts_init(); + m53xx_fec_init(); +#if IS_ENABLED(CONFIG_SPI_COLDFIRE_QSPI) + m53xx_qspi_init(); +#endif + +#ifdef CONFIG_BDM_DISABLE + /* + * Disable the BDM clocking. This also turns off most of the rest of + * the BDM device. This is good for EMC reasons. This option is not + * incompatible with the memory protection option. + */ + wdebug(MCFDEBUG_CSR, MCFDEBUG_CSR_PSTCLK); +#endif +} + +/***************************************************************************/ +/* Board initialization */ +/***************************************************************************/ +/* + * PLL min/max specifications + */ +#define MAX_FVCO 500000 /* KHz */ +#define MAX_FSYS 80000 /* KHz */ +#define MIN_FSYS 58333 /* KHz */ +#define FREF 16000 /* KHz */ + + +#define MAX_MFD 135 /* Multiplier */ +#define MIN_MFD 88 /* Multiplier */ +#define BUSDIV 6 /* Divider */ + +/* + * Low Power Divider specifications + */ +#define MIN_LPD (1 << 0) /* Divider (not encoded) */ +#define MAX_LPD (1 << 15) /* Divider (not encoded) */ +#define DEFAULT_LPD (1 << 1) /* Divider (not encoded) */ + +#define SYS_CLK_KHZ 80000 +#define SYSTEM_PERIOD 12.5 +/* + * SDRAM Timing Parameters + */ +#define SDRAM_BL 8 /* # of beats in a burst */ +#define SDRAM_TWR 2 /* in clocks */ +#define SDRAM_CASL 2.5 /* CASL in clocks */ +#define SDRAM_TRCD 2 /* in clocks */ +#define SDRAM_TRP 2 /* in clocks */ +#define SDRAM_TRFC 7 /* in clocks */ +#define SDRAM_TREFI 7800 /* in ns */ + +#define EXT_SRAM_ADDRESS (0xC0000000) +#define FLASH_ADDRESS (0x00000000) +#define SDRAM_ADDRESS (0x40000000) + +#define NAND_FLASH_ADDRESS (0xD0000000) + +int sys_clk_khz = 0; +int sys_clk_mhz = 0; + +void wtm_init(void); +void scm_init(void); +void gpio_init(void); +void fbcs_init(void); +void sdramc_init(void); +int clock_pll (int fsys, int flags); +int clock_limp (int); +int clock_exit_limp (void); +int get_sys_clock (void); + +asmlinkage void __init sysinit(void) +{ + sys_clk_khz = clock_pll(0, 0); + sys_clk_mhz = sys_clk_khz/1000; + + wtm_init(); + scm_init(); + gpio_init(); + fbcs_init(); + sdramc_init(); +} + +void wtm_init(void) +{ + /* Disable watchdog timer */ + writew(0, MCF_WTM_WCR); +} + +#define MCF_SCM_BCR_GBW (0x00000100) +#define MCF_SCM_BCR_GBR (0x00000200) + +void scm_init(void) +{ + /* All masters are trusted */ + writel(0x77777777, MCF_SCM_MPR); + + /* Allow supervisor/user, read/write, and trusted/untrusted + access to all slaves */ + writel(0, MCF_SCM_PACRA); + writel(0, MCF_SCM_PACRB); + writel(0, MCF_SCM_PACRC); + writel(0, MCF_SCM_PACRD); + writel(0, MCF_SCM_PACRE); + writel(0, MCF_SCM_PACRF); + + /* Enable bursts */ + writel(MCF_SCM_BCR_GBR | MCF_SCM_BCR_GBW, MCF_SCM_BCR); +} + + +void fbcs_init(void) +{ + writeb(0x3E, MCFGPIO_PAR_CS); + + /* Latch chip select */ + writel(0x10080000, MCF_FBCS1_CSAR); + + writel(0x002A3780, MCF_FBCS1_CSCR); + writel(MCF_FBCS_CSMR_BAM_2M | MCF_FBCS_CSMR_V, MCF_FBCS1_CSMR); + + /* Initialize latch to drive signals to inactive states */ + writew(0xffff, 0x10080000); + + /* External SRAM */ + writel(EXT_SRAM_ADDRESS, MCF_FBCS1_CSAR); + writel(MCF_FBCS_CSCR_PS_16 | + MCF_FBCS_CSCR_AA | + MCF_FBCS_CSCR_SBM | + MCF_FBCS_CSCR_WS(1), + MCF_FBCS1_CSCR); + writel(MCF_FBCS_CSMR_BAM_512K | MCF_FBCS_CSMR_V, MCF_FBCS1_CSMR); + + /* Boot Flash connected to FBCS0 */ + writel(FLASH_ADDRESS, MCF_FBCS0_CSAR); + writel(MCF_FBCS_CSCR_PS_16 | + MCF_FBCS_CSCR_BEM | + MCF_FBCS_CSCR_AA | + MCF_FBCS_CSCR_SBM | + MCF_FBCS_CSCR_WS(7), + MCF_FBCS0_CSCR); + writel(MCF_FBCS_CSMR_BAM_32M | MCF_FBCS_CSMR_V, MCF_FBCS0_CSMR); +} + +void sdramc_init(void) +{ + /* + * Check to see if the SDRAM has already been initialized + * by a run control tool + */ + if (!(readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF)) { + /* SDRAM chip select initialization */ + + /* Initialize SDRAM chip select */ + writel(MCF_SDRAMC_SDCS_BA(SDRAM_ADDRESS) | + MCF_SDRAMC_SDCS_CSSZ(MCF_SDRAMC_SDCS_CSSZ_32MBYTE), + MCF_SDRAMC_SDCS0); + + /* + * Basic configuration and initialization + */ + writel(MCF_SDRAMC_SDCFG1_SRD2RW((int)((SDRAM_CASL + 2) + 0.5)) | + MCF_SDRAMC_SDCFG1_SWT2RD(SDRAM_TWR + 1) | + MCF_SDRAMC_SDCFG1_RDLAT((int)((SDRAM_CASL * 2) + 2)) | + MCF_SDRAMC_SDCFG1_ACT2RW((int)(SDRAM_TRCD + 0.5)) | + MCF_SDRAMC_SDCFG1_PRE2ACT((int)(SDRAM_TRP + 0.5)) | + MCF_SDRAMC_SDCFG1_REF2ACT((int)(SDRAM_TRFC + 0.5)) | + MCF_SDRAMC_SDCFG1_WTLAT(3), + MCF_SDRAMC_SDCFG1); + writel(MCF_SDRAMC_SDCFG2_BRD2PRE(SDRAM_BL / 2 + 1) | + MCF_SDRAMC_SDCFG2_BWT2RW(SDRAM_BL / 2 + SDRAM_TWR) | + MCF_SDRAMC_SDCFG2_BRD2WT((int)((SDRAM_CASL + SDRAM_BL / 2 - 1.0) + 0.5)) | + MCF_SDRAMC_SDCFG2_BL(SDRAM_BL - 1), + MCF_SDRAMC_SDCFG2); + + + /* + * Precharge and enable write to SDMR + */ + writel(MCF_SDRAMC_SDCR_MODE_EN | + MCF_SDRAMC_SDCR_CKE | + MCF_SDRAMC_SDCR_DDR | + MCF_SDRAMC_SDCR_MUX(1) | + MCF_SDRAMC_SDCR_RCNT((int)(((SDRAM_TREFI / (SYSTEM_PERIOD * 64)) - 1) + 0.5)) | + MCF_SDRAMC_SDCR_PS_16 | + MCF_SDRAMC_SDCR_IPALL, + MCF_SDRAMC_SDCR); + + /* + * Write extended mode register + */ + writel(MCF_SDRAMC_SDMR_BNKAD_LEMR | + MCF_SDRAMC_SDMR_AD(0x0) | + MCF_SDRAMC_SDMR_CMD, + MCF_SDRAMC_SDMR); + + /* + * Write mode register and reset DLL + */ + writel(MCF_SDRAMC_SDMR_BNKAD_LMR | + MCF_SDRAMC_SDMR_AD(0x163) | + MCF_SDRAMC_SDMR_CMD, + MCF_SDRAMC_SDMR); + + /* + * Execute a PALL command + */ + writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IPALL, MCF_SDRAMC_SDCR); + + /* + * Perform two REF cycles + */ + writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IREF, MCF_SDRAMC_SDCR); + writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_IREF, MCF_SDRAMC_SDCR); + + /* + * Write mode register and clear reset DLL + */ + writel(MCF_SDRAMC_SDMR_BNKAD_LMR | + MCF_SDRAMC_SDMR_AD(0x063) | + MCF_SDRAMC_SDMR_CMD, + MCF_SDRAMC_SDMR); + + /* + * Enable auto refresh and lock SDMR + */ + writel(readl(MCF_SDRAMC_SDCR) & ~MCF_SDRAMC_SDCR_MODE_EN, + MCF_SDRAMC_SDCR); + writel(MCF_SDRAMC_SDCR_REF | MCF_SDRAMC_SDCR_DQS_OE(0xC), + MCF_SDRAMC_SDCR); + } +} + +void gpio_init(void) +{ + /* Enable UART0 pins */ + writew(MCF_GPIO_PAR_UART_PAR_URXD0 | MCF_GPIO_PAR_UART_PAR_UTXD0, + MCFGPIO_PAR_UART); + + /* + * Initialize TIN3 as a GPIO output to enable the write + * half of the latch. + */ + writeb(0x00, MCFGPIO_PAR_TIMER); + writeb(0x08, MCFGPIO_PDDR_TIMER); + writeb(0x00, MCFGPIO_PCLRR_TIMER); +} + +int clock_pll(int fsys, int flags) +{ + int fref, temp, fout, mfd; + u32 i; + + fref = FREF; + + if (fsys == 0) { + /* Return current PLL output */ + mfd = readb(MCF_PLL_PFDR); + + return (fref * mfd / (BUSDIV * 4)); + } + + /* Check bounds of requested system clock */ + if (fsys > MAX_FSYS) + fsys = MAX_FSYS; + if (fsys < MIN_FSYS) + fsys = MIN_FSYS; + + /* Multiplying by 100 when calculating the temp value, + and then dividing by 100 to calculate the mfd allows + for exact values without needing to include floating + point libraries. */ + temp = 100 * fsys / fref; + mfd = 4 * BUSDIV * temp / 100; + + /* Determine the output frequency for selected values */ + fout = (fref * mfd / (BUSDIV * 4)); + + /* + * Check to see if the SDRAM has already been initialized. + * If it has then the SDRAM needs to be put into self refresh + * mode before reprogramming the PLL. + */ + if (readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF) + /* Put SDRAM into self refresh mode */ + writel(readl(MCF_SDRAMC_SDCR) & ~MCF_SDRAMC_SDCR_CKE, + MCF_SDRAMC_SDCR); + + /* + * Initialize the PLL to generate the new system clock frequency. + * The device must be put into LIMP mode to reprogram the PLL. + */ + + /* Enter LIMP mode */ + clock_limp(DEFAULT_LPD); + + /* Reprogram PLL for desired fsys */ + writeb(MCF_PLL_PODR_CPUDIV(BUSDIV/3) | MCF_PLL_PODR_BUSDIV(BUSDIV), + MCF_PLL_PODR); + + writeb(mfd, MCF_PLL_PFDR); + + /* Exit LIMP mode */ + clock_exit_limp(); + + /* + * Return the SDRAM to normal operation if it is in use. + */ + if (readl(MCF_SDRAMC_SDCR) & MCF_SDRAMC_SDCR_REF) + /* Exit self refresh mode */ + writel(readl(MCF_SDRAMC_SDCR) | MCF_SDRAMC_SDCR_CKE, + MCF_SDRAMC_SDCR); + + /* Errata - workaround for SDRAM opeartion after exiting LIMP mode */ + writel(MCF_SDRAMC_REFRESH, MCF_SDRAMC_LIMP_FIX); + + /* wait for DQS logic to relock */ + for (i = 0; i < 0x200; i++) + ; + + return fout; +} + +int clock_limp(int div) +{ + u32 temp; + + /* Check bounds of divider */ + if (div < MIN_LPD) + div = MIN_LPD; + if (div > MAX_LPD) + div = MAX_LPD; + + /* Save of the current value of the SSIDIV so we don't + overwrite the value*/ + temp = readw(MCF_CCM_CDR) & MCF_CCM_CDR_SSIDIV(0xF); + + /* Apply the divider to the system clock */ + writew(MCF_CCM_CDR_LPDIV(div) | MCF_CCM_CDR_SSIDIV(temp), MCF_CCM_CDR); + + writew(readw(MCF_CCM_MISCCR) | MCF_CCM_MISCCR_LIMP, MCF_CCM_MISCCR); + + return (FREF/(3*(1 << div))); +} + +int clock_exit_limp(void) +{ + int fout; + + /* Exit LIMP mode */ + writew(readw(MCF_CCM_MISCCR) & ~MCF_CCM_MISCCR_LIMP, MCF_CCM_MISCCR); + + /* Wait for PLL to lock */ + while (!(readw(MCF_CCM_MISCCR) & MCF_CCM_MISCCR_PLL_LOCK)) + ; + + fout = get_sys_clock(); + + return fout; +} + +int get_sys_clock(void) +{ + int divider; + + /* Test to see if device is in LIMP mode */ + if (readw(MCF_CCM_MISCCR) & MCF_CCM_MISCCR_LIMP) { + divider = readw(MCF_CCM_CDR) & MCF_CCM_CDR_LPDIV(0xF); + return (FREF/(2 << divider)); + } + else + return (FREF * readb(MCF_PLL_PFDR)) / (BUSDIV * 4); +} diff --git a/arch/m68k/platform/coldfire/timers.c b/arch/m68k/platform/coldfire/timers.c index 51f6d2af807f..d06068e45764 100644 --- a/arch/m68k/platform/coldfire/timers.c +++ b/arch/m68k/platform/coldfire/timers.c @@ -36,7 +36,7 @@ */ void coldfire_profile_init(void); -#if defined(CONFIG_M532x) || defined(CONFIG_M5441x) +#if defined(CONFIG_M53xx) || defined(CONFIG_M5441x) #define __raw_readtrr __raw_readl #define __raw_writetrr __raw_writel #else diff --git a/arch/metag/Kconfig b/arch/metag/Kconfig index 6f16c1469327..dcd94406030e 100644 --- a/arch/metag/Kconfig +++ b/arch/metag/Kconfig @@ -52,9 +52,6 @@ config GENERIC_HWEIGHT config GENERIC_CALIBRATE_DELAY def_bool y -config GENERIC_GPIO - def_bool n - config NO_IOPORT def_bool y diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig index 54237af0b07c..d22a4ecffff4 100644 --- a/arch/microblaze/Kconfig +++ b/arch/microblaze/Kconfig @@ -54,9 +54,6 @@ config GENERIC_HWEIGHT config GENERIC_CALIBRATE_DELAY def_bool y -config GENERIC_GPIO - bool - config GENERIC_CSUM def_bool y diff --git a/arch/microblaze/configs/mmu_defconfig b/arch/microblaze/configs/mmu_defconfig index d2b097a652d9..3649a8b150c0 100644 --- a/arch/microblaze/configs/mmu_defconfig +++ b/arch/microblaze/configs/mmu_defconfig @@ -17,7 +17,6 @@ CONFIG_MODULE_UNLOAD=y # CONFIG_BLK_DEV_BSG is not set CONFIG_PARTITION_ADVANCED=y # CONFIG_EFI_PARTITION is not set -CONFIG_OPT_LIB_ASM=y CONFIG_XILINX_MICROBLAZE0_USE_MSR_INSTR=1 CONFIG_XILINX_MICROBLAZE0_USE_PCMP_INSTR=1 CONFIG_XILINX_MICROBLAZE0_USE_BARREL=1 diff --git a/arch/microblaze/include/asm/pci.h b/arch/microblaze/include/asm/pci.h index 41cc841091b0..d52abb6812fa 100644 --- a/arch/microblaze/include/asm/pci.h +++ b/arch/microblaze/include/asm/pci.h @@ -153,7 +153,5 @@ extern void __init xilinx_pci_init(void); static inline void __init xilinx_pci_init(void) { return; } #endif -#include - #endif /* __KERNEL__ */ #endif /* __ASM_MICROBLAZE_PCI_H */ diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h index a1ab5f0009ef..efe59d881789 100644 --- a/arch/microblaze/include/asm/uaccess.h +++ b/arch/microblaze/include/asm/uaccess.h @@ -90,17 +90,25 @@ static inline int ___range_ok(unsigned long addr, unsigned long size) #else -/* - * Address is valid if: - * - "addr", "addr + size" and "size" are all below the limit - */ -#define access_ok(type, addr, size) \ - (get_fs().seg >= (((unsigned long)(addr)) | \ - (size) | ((unsigned long)(addr) + (size)))) - -/* || printk("access_ok failed for %s at 0x%08lx (size %d), seg 0x%08x\n", - type?"WRITE":"READ",addr,size,get_fs().seg)) */ - +static inline int access_ok(int type, const void __user *addr, + unsigned long size) +{ + if (!size) + goto ok; + + if ((get_fs().seg < ((unsigned long)addr)) || + (get_fs().seg < ((unsigned long)addr + size - 1))) { + pr_debug("ACCESS fail: %s at 0x%08x (size 0x%x), seg 0x%08x\n", + type ? "WRITE" : "READ ", (u32)addr, (u32)size, + (u32)get_fs().seg); + return 0; + } +ok: + pr_debug("ACCESS OK: %s at 0x%08x (size 0x%x), seg 0x%08x\n", + type ? "WRITE" : "READ ", (u32)addr, (u32)size, + (u32)get_fs().seg); + return 1; +} #endif #ifdef CONFIG_MMU diff --git a/arch/microblaze/kernel/cpu/cpuinfo.c b/arch/microblaze/kernel/cpu/cpuinfo.c index 0b2299bcb948..410398f6db55 100644 --- a/arch/microblaze/kernel/cpu/cpuinfo.c +++ b/arch/microblaze/kernel/cpu/cpuinfo.c @@ -37,6 +37,8 @@ const struct cpu_ver_key cpu_ver_lookup[] = { {"8.20.a", 0x15}, {"8.20.b", 0x16}, {"8.30.a", 0x17}, + {"8.40.a", 0x18}, + {"8.40.b", 0x19}, {NULL, 0}, }; @@ -57,6 +59,9 @@ const struct family_string_key family_string_lookup[] = { {"virtex6", 0xe}, /* FIXME There is no key code defined for spartan2 */ {"spartan2", 0xf0}, + {"kintex7", 0x10}, + {"artix7", 0x11}, + {"zynq7000", 0x12}, {NULL, 0}, }; diff --git a/arch/microblaze/kernel/head.S b/arch/microblaze/kernel/head.S index eef84de5e8c8..fcc797feb9db 100644 --- a/arch/microblaze/kernel/head.S +++ b/arch/microblaze/kernel/head.S @@ -112,16 +112,16 @@ no_fdt_arg: * copy command line directly to cmd_line placed in data section. */ beqid r5, skip /* Skip if NULL pointer */ - or r6, r0, r0 /* incremment */ + or r11, r0, r0 /* incremment */ ori r4, r0, cmd_line /* load address of command line */ tophys(r4,r4) /* convert to phys address */ ori r3, r0, COMMAND_LINE_SIZE - 1 /* number of loops */ _copy_command_line: /* r2=r5+r6 - r5 contain pointer to command line */ - lbu r2, r5, r6 + lbu r2, r5, r11 beqid r2, skip /* Skip if no data */ - sb r2, r4, r6 /* addr[r4+r6]= r2*/ - addik r6, r6, 1 /* increment counting */ + sb r2, r4, r11 /* addr[r4+r6]= r2 */ + addik r11, r11, 1 /* increment counting */ bgtid r3, _copy_command_line /* loop for all entries */ addik r3, r3, -1 /* decrement loop */ addik r5, r4, 0 /* add new space for command line */ @@ -131,13 +131,13 @@ skip: #ifdef NOT_COMPILE /* save bram context */ - or r6, r0, r0 /* incremment */ + or r11, r0, r0 /* incremment */ ori r4, r0, TOPHYS(_bram_load_start) /* save bram context */ ori r3, r0, (LMB_SIZE - 4) _copy_bram: - lw r7, r0, r6 /* r7 = r0 + r6 */ - sw r7, r4, r6 /* addr[r4 + r6] = r7*/ - addik r6, r6, 4 /* increment counting */ + lw r7, r0, r11 /* r7 = r0 + r6 */ + sw r7, r4, r11 /* addr[r4 + r6] = r7 */ + addik r11, r11, 4 /* increment counting */ bgtid r3, _copy_bram /* loop for all entries */ addik r3, r3, -4 /* descrement loop */ #endif @@ -303,8 +303,8 @@ jump_over2: * the exception vectors, using a 4k real==virtual mapping. */ /* Use temporary TLB_ID for LMB - clear this temporary mapping later */ - ori r6, r0, MICROBLAZE_LMB_TLB_ID - mts rtlbx,r6 + ori r11, r0, MICROBLAZE_LMB_TLB_ID + mts rtlbx,r11 ori r4,r0,(TLB_WR | TLB_EX) ori r3,r0,(TLB_VALID | TLB_PAGESZ(PAGESZ_4K)) diff --git a/arch/microblaze/kernel/intc.c b/arch/microblaze/kernel/intc.c index 8778adf72bd3..d85fa3a2b0f8 100644 --- a/arch/microblaze/kernel/intc.c +++ b/arch/microblaze/kernel/intc.c @@ -172,4 +172,6 @@ void __init init_IRQ(void) * and commits this patch. ~~gcl */ root_domain = irq_domain_add_linear(intc, nr_irq, &xintc_irq_domain_ops, (void *)intr_mask); + + irq_set_default_host(root_domain); } diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c index a55893807274..7d1a9c8b1f3d 100644 --- a/arch/microblaze/kernel/process.c +++ b/arch/microblaze/kernel/process.c @@ -160,3 +160,8 @@ int dump_fpu(struct pt_regs *regs, elf_fpregset_t *fpregs) return 0; /* MicroBlaze has no separate FPU registers */ } #endif /* CONFIG_MMU */ + +void arch_cpu_idle(void) +{ + local_irq_enable(); +} diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c index 4ec137d13ad7..b38ae3acfeb4 100644 --- a/arch/microblaze/mm/init.c +++ b/arch/microblaze/mm/init.c @@ -404,10 +404,11 @@ asmlinkage void __init mmu_init(void) #if defined(CONFIG_BLK_DEV_INITRD) /* Remove the init RAM disk from the available memory. */ -/* if (initrd_start) { - mem_pieces_remove(&phys_avail, __pa(initrd_start), - initrd_end - initrd_start, 1); - }*/ + if (initrd_start) { + unsigned long size; + size = initrd_end - initrd_start; + memblock_reserve(virt_to_phys(initrd_start), size); + } #endif /* CONFIG_BLK_DEV_INITRD */ /* Initialize the MMU hardware */ diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index 9ea521e4959e..bdb8ea100e73 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -30,7 +30,6 @@ #include #include #include -#include #include #include diff --git a/arch/mips/Kbuild b/arch/mips/Kbuild index 7dd65cfae837..d2cfe45f332b 100644 --- a/arch/mips/Kbuild +++ b/arch/mips/Kbuild @@ -17,3 +17,7 @@ obj- := $(platform-) obj-y += kernel/ obj-y += mm/ obj-y += math-emu/ + +ifdef CONFIG_KVM +obj-y += kvm/ +endif diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index e5f3794744f1..7a58ab933b20 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -61,8 +61,7 @@ config MIPS_ALCHEMY select SYS_HAS_CPU_MIPS32_R1 select SYS_SUPPORTS_32BIT_KERNEL select SYS_SUPPORTS_APM_EMULATION - select GENERIC_GPIO - select ARCH_WANT_OPTIONAL_GPIOLIB + select ARCH_REQUIRE_GPIOLIB select SYS_SUPPORTS_ZBOOT select USB_ARCH_HAS_OHCI select USB_ARCH_HAS_EHCI @@ -225,7 +224,6 @@ config MACH_JZ4740 select SYS_SUPPORTS_ZBOOT_UART16550 select DMA_NONCOHERENT select IRQ_CPU - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select SYS_HAS_EARLY_PRINTK select HAVE_PWM @@ -306,7 +304,6 @@ config MIPS_MALTA select HW_HAS_PCI select I8253 select I8259 - select MIPS_BOARDS_GEN select MIPS_BONITO64 select MIPS_CPU_SCACHE select PCI_GT64XXX_PCI0 @@ -337,12 +334,12 @@ config MIPS_SEAD3 select BOOT_RAW select CEVT_R4K select CSRC_R4K + select CSRC_GIC select CPU_MIPSR2_IRQ_VI select CPU_MIPSR2_IRQ_EI select DMA_NONCOHERENT select IRQ_CPU select IRQ_GIC - select MIPS_BOARDS_GEN select MIPS_CPU_SCACHE select MIPS_MSC select SYS_HAS_CPU_MIPS32_R1 @@ -354,6 +351,7 @@ config MIPS_SEAD3 select SYS_SUPPORTS_BIG_ENDIAN select SYS_SUPPORTS_LITTLE_ENDIAN select SYS_SUPPORTS_SMARTMIPS + select SYS_SUPPORTS_MICROMIPS select USB_ARCH_HAS_EHCI select USB_EHCI_BIG_ENDIAN_DESC select USB_EHCI_BIG_ENDIAN_MMIO @@ -912,6 +910,9 @@ config CEVT_GT641XX config CEVT_R4K bool +config CEVT_GIC + bool + config CEVT_SB1250 bool @@ -937,7 +938,6 @@ config CSRC_SB1250 bool config GPIO_TXX9 - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB bool @@ -985,9 +985,6 @@ config MIPS_MSC config MIPS_NILE4 bool -config MIPS_DISABLE_OBSOLETE_IDE - bool - config SYNC_R4K bool @@ -1009,9 +1006,6 @@ config GENERIC_ISA_DMA_SUPPORT_BROKEN config ISA_DMA_API bool -config GENERIC_GPIO - bool - config HOLES_IN_ZONE bool @@ -1081,9 +1075,6 @@ config IRQ_GT641XX config IRQ_GIC bool -config MIPS_BOARDS_GEN - bool - config PCI_GT64XXX_PCI0 bool @@ -1112,7 +1103,6 @@ config SOC_PNX833X select SYS_SUPPORTS_32BIT_KERNEL select SYS_SUPPORTS_LITTLE_ENDIAN select SYS_SUPPORTS_BIG_ENDIAN - select GENERIC_GPIO select CPU_MIPSR2_IRQ_VI config SOC_PNX8335 @@ -1154,7 +1144,7 @@ config BOOT_ELF32 config MIPS_L1_CACHE_SHIFT int - default "4" if MACH_DECSTATION || MIKROTIK_RB532 || PMC_MSP4200_EVAL + default "4" if MACH_DECSTATION || MIKROTIK_RB532 || PMC_MSP4200_EVAL || SOC_RT288X default "6" if MIPS_CPU_SCACHE default "7" if SGI_IP22 || SGI_IP27 || SGI_IP28 || SNI_RM || CPU_CAVIUM_OCTEON default "5" @@ -1203,7 +1193,6 @@ config CPU_LOONGSON2F bool "Loongson 2F" depends on SYS_HAS_CPU_LOONGSON2F select CPU_LOONGSON2 - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB help The Loongson 2F processor implements the MIPS III instruction set @@ -1244,6 +1233,7 @@ config CPU_MIPS32_R2 select CPU_HAS_PREFETCH select CPU_SUPPORTS_32BIT_KERNEL select CPU_SUPPORTS_HIGHMEM + select HAVE_KVM help Choose this option to build a kernel for release 2 or later of the MIPS32 architecture. Most modern embedded systems with a 32-bit @@ -1744,6 +1734,20 @@ config 64BIT endchoice +config KVM_GUEST + bool "KVM Guest Kernel" + help + Select this option if building a guest kernel for KVM (Trap & Emulate) mode + +config KVM_HOST_FREQ + int "KVM Host Processor Frequency (MHz)" + depends on KVM_GUEST + default 500 + help + Select this option if building a guest kernel for KVM to skip + RTC emulation when determining guest CPU Frequency. Instead, the guest + processor frequency is automatically derived from the host frequency. + choice prompt "Kernel page size" default PAGE_SIZE_4KB @@ -1819,6 +1823,15 @@ config FORCE_MAX_ZONEORDER The page size is not necessarily 4KB. Keep this in mind when choosing a value for this option. +config CEVT_GIC + bool "Use GIC global counter for clock events" + depends on IRQ_GIC && !(MIPS_SEAD3 || MIPS_MT_SMTC) + help + Use the GIC global counter for the clock events. The R4K clock + event driver is always present, so if the platform ends up not + detecting a GIC, it will fall back to the R4K timer for the + generation of clock events. + config BOARD_SCACHE bool @@ -2024,6 +2037,7 @@ config SB1_PASS_2_1_WORKAROUNDS depends on CPU_SB1 && CPU_SB1_PASS_2 default y + config 64BIT_PHYS_ADDR bool @@ -2042,6 +2056,13 @@ config CPU_HAS_SMARTMIPS you don't know you probably don't have SmartMIPS and should say N here. +config CPU_MICROMIPS + depends on SYS_SUPPORTS_MICROMIPS + bool "Build kernel using microMIPS ISA" + help + When this option is enabled the kernel will be built using the + microMIPS ISA + config CPU_HAS_WB bool @@ -2104,6 +2125,9 @@ config SYS_SUPPORTS_HIGHMEM config SYS_SUPPORTS_SMARTMIPS bool +config SYS_SUPPORTS_MICROMIPS + bool + config ARCH_FLATMEM_ENABLE def_bool y depends on !NUMA && !CPU_LOONGSON2 @@ -2564,3 +2588,5 @@ source "security/Kconfig" source "crypto/Kconfig" source "lib/Kconfig" + +source "arch/mips/kvm/Kconfig" diff --git a/arch/mips/Makefile b/arch/mips/Makefile index 6f7978f95090..dd58a04ef4bc 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -114,6 +114,7 @@ cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(shell $(CC) -dumpmachine |grep -q 'mips.*e cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += $(shell $(CC) -dumpmachine |grep -q 'mips.*el-.*' || echo -EL $(undef-all) $(predef-le)) cflags-$(CONFIG_CPU_HAS_SMARTMIPS) += $(call cc-option,-msmartmips) +cflags-$(CONFIG_CPU_MICROMIPS) += $(call cc-option,-mmicromips -mno-jals) cflags-$(CONFIG_SB1XXX_CORELIS) += $(call cc-option,-mno-sched-prolog) \ -fno-omit-frame-pointer diff --git a/arch/mips/alchemy/Kconfig b/arch/mips/alchemy/Kconfig index c8862bdc2ff2..7032ac7ecd1b 100644 --- a/arch/mips/alchemy/Kconfig +++ b/arch/mips/alchemy/Kconfig @@ -31,7 +31,6 @@ config MIPS_DB1000 select ALCHEMY_GPIOINT_AU1000 select DMA_NONCOHERENT select HW_HAS_PCI - select MIPS_DISABLE_OBSOLETE_IDE select SYS_SUPPORTS_BIG_ENDIAN select SYS_SUPPORTS_LITTLE_ENDIAN select SYS_HAS_EARLY_PRINTK @@ -41,7 +40,6 @@ config MIPS_DB1235 select ARCH_REQUIRE_GPIOLIB select HW_HAS_PCI select DMA_COHERENT - select MIPS_DISABLE_OBSOLETE_IDE select SYS_SUPPORTS_LITTLE_ENDIAN select SYS_HAS_EARLY_PRINTK @@ -57,7 +55,6 @@ config MIPS_GPR select ALCHEMY_GPIOINT_AU1000 select HW_HAS_PCI select DMA_NONCOHERENT - select MIPS_DISABLE_OBSOLETE_IDE select SYS_SUPPORTS_LITTLE_ENDIAN select SYS_HAS_EARLY_PRINTK diff --git a/arch/mips/alchemy/Platform b/arch/mips/alchemy/Platform index fa1bdd1aea15..b3afcdd8d77a 100644 --- a/arch/mips/alchemy/Platform +++ b/arch/mips/alchemy/Platform @@ -5,32 +5,14 @@ platform-$(CONFIG_MIPS_ALCHEMY) += alchemy/common/ # -# AMD Alchemy Pb1100 eval board -# -platform-$(CONFIG_MIPS_PB1100) += alchemy/devboards/ -load-$(CONFIG_MIPS_PB1100) += 0xffffffff80100000 - -# -# AMD Alchemy Pb1500 eval board -# -platform-$(CONFIG_MIPS_PB1500) += alchemy/devboards/ -load-$(CONFIG_MIPS_PB1500) += 0xffffffff80100000 - -# -# AMD Alchemy Pb1550 eval board -# -platform-$(CONFIG_MIPS_PB1550) += alchemy/devboards/ -load-$(CONFIG_MIPS_PB1550) += 0xffffffff80100000 - -# -# AMD Alchemy Db1000/Db1500/Db1100 eval boards +# AMD Alchemy Db1000/Db1500/Pb1500/Db1100/Pb1100 eval boards # platform-$(CONFIG_MIPS_DB1000) += alchemy/devboards/ cflags-$(CONFIG_MIPS_DB1000) += -I$(srctree)/arch/mips/include/asm/mach-db1x00 load-$(CONFIG_MIPS_DB1000) += 0xffffffff80100000 # -# AMD Alchemy Db1200/Pb1200/Db1550/Db1300 eval boards +# AMD Alchemy Db1200/Pb1200/Db1550/Pb1550/Db1300 eval boards # platform-$(CONFIG_MIPS_DB1235) += alchemy/devboards/ cflags-$(CONFIG_MIPS_DB1235) += -I$(srctree)/arch/mips/include/asm/mach-db1x00 diff --git a/arch/mips/ar7/memory.c b/arch/mips/ar7/memory.c index 28abfeef09d6..92dfa481205b 100644 --- a/arch/mips/ar7/memory.c +++ b/arch/mips/ar7/memory.c @@ -30,7 +30,6 @@ #include #include -#include static int __init memsize(void) { diff --git a/arch/mips/ath79/setup.c b/arch/mips/ath79/setup.c index d5b3c9057018..a0233a2c1988 100644 --- a/arch/mips/ath79/setup.c +++ b/arch/mips/ath79/setup.c @@ -51,20 +51,6 @@ static void ath79_halt(void) cpu_wait(); } -static void __init ath79_detect_mem_size(void) -{ - unsigned long size; - - for (size = ATH79_MEM_SIZE_MIN; size < ATH79_MEM_SIZE_MAX; - size <<= 1) { - if (!memcmp(ath79_detect_mem_size, - ath79_detect_mem_size + size, 1024)) - break; - } - - add_memory_region(0, size, BOOT_MEM_RAM); -} - static void __init ath79_detect_sys_type(void) { char *chip = "????"; @@ -212,7 +198,7 @@ void __init plat_mem_setup(void) AR71XX_DDR_CTRL_SIZE); ath79_detect_sys_type(); - ath79_detect_mem_size(); + detect_memory_region(0, ATH79_MEM_SIZE_MIN, ATH79_MEM_SIZE_MAX); ath79_clocks_init(); _machine_restart = ath79_restart; diff --git a/arch/mips/bcm63xx/Kconfig b/arch/mips/bcm63xx/Kconfig index d03e8799d1cf..5639662fd503 100644 --- a/arch/mips/bcm63xx/Kconfig +++ b/arch/mips/bcm63xx/Kconfig @@ -25,6 +25,10 @@ config BCM63XX_CPU_6358 bool "support 6358 CPU" select HW_HAS_PCI +config BCM63XX_CPU_6362 + bool "support 6362 CPU" + select HW_HAS_PCI + config BCM63XX_CPU_6368 bool "support 6368 CPU" select HW_HAS_PCI diff --git a/arch/mips/bcm63xx/boards/board_bcm963xx.c b/arch/mips/bcm63xx/boards/board_bcm963xx.c index 9aa7d44898ed..a9505c4867e8 100644 --- a/arch/mips/bcm63xx/boards/board_bcm963xx.c +++ b/arch/mips/bcm63xx/boards/board_bcm963xx.c @@ -726,11 +726,11 @@ void __init board_prom_init(void) u32 val; /* read base address of boot chip select (0) - * 6328 does not have MPI but boots from a fixed address + * 6328/6362 do not have MPI but boot from a fixed address */ - if (BCMCPU_IS_6328()) + if (BCMCPU_IS_6328() || BCMCPU_IS_6362()) { val = 0x18000000; - else { + } else { val = bcm_mpi_readl(MPI_CSBASE_REG(0)); val &= MPI_CSBASE_BASE_MASK; } diff --git a/arch/mips/bcm63xx/clk.c b/arch/mips/bcm63xx/clk.c index b9e948d59430..c726a97fc798 100644 --- a/arch/mips/bcm63xx/clk.c +++ b/arch/mips/bcm63xx/clk.c @@ -15,7 +15,13 @@ #include #include #include -#include + +struct clk { + void (*set)(struct clk *, int); + unsigned int rate; + unsigned int usage; + int id; +}; static DEFINE_MUTEX(clocks_mutex); @@ -119,11 +125,18 @@ static struct clk clk_ephy = { */ static void enetsw_set(struct clk *clk, int enable) { - if (!BCMCPU_IS_6368()) + if (BCMCPU_IS_6328()) + bcm_hwclock_set(CKCTL_6328_ROBOSW_EN, enable); + else if (BCMCPU_IS_6362()) + bcm_hwclock_set(CKCTL_6362_ROBOSW_EN, enable); + else if (BCMCPU_IS_6368()) + bcm_hwclock_set(CKCTL_6368_ROBOSW_EN | + CKCTL_6368_SWPKT_USB_EN | + CKCTL_6368_SWPKT_SAR_EN, + enable); + else return; - bcm_hwclock_set(CKCTL_6368_ROBOSW_EN | - CKCTL_6368_SWPKT_USB_EN | - CKCTL_6368_SWPKT_SAR_EN, enable); + if (enable) { /* reset switch core afer clock change */ bcm63xx_core_set_reset(BCM63XX_RESET_ENETSW, 1); @@ -160,6 +173,8 @@ static void usbh_set(struct clk *clk, int enable) bcm_hwclock_set(CKCTL_6328_USBH_EN, enable); else if (BCMCPU_IS_6348()) bcm_hwclock_set(CKCTL_6348_USBH_EN, enable); + else if (BCMCPU_IS_6362()) + bcm_hwclock_set(CKCTL_6362_USBH_EN, enable); else if (BCMCPU_IS_6368()) bcm_hwclock_set(CKCTL_6368_USBH_EN, enable); } @@ -175,6 +190,8 @@ static void usbd_set(struct clk *clk, int enable) { if (BCMCPU_IS_6328()) bcm_hwclock_set(CKCTL_6328_USBD_EN, enable); + else if (BCMCPU_IS_6362()) + bcm_hwclock_set(CKCTL_6362_USBD_EN, enable); else if (BCMCPU_IS_6368()) bcm_hwclock_set(CKCTL_6368_USBD_EN, enable); } @@ -196,6 +213,8 @@ static void spi_set(struct clk *clk, int enable) mask = CKCTL_6348_SPI_EN; else if (BCMCPU_IS_6358()) mask = CKCTL_6358_SPI_EN; + else if (BCMCPU_IS_6362()) + mask = CKCTL_6362_SPI_EN; else /* BCMCPU_IS_6368 */ mask = CKCTL_6368_SPI_EN; @@ -236,7 +255,10 @@ static struct clk clk_xtm = { */ static void ipsec_set(struct clk *clk, int enable) { - bcm_hwclock_set(CKCTL_6368_IPSEC_EN, enable); + if (BCMCPU_IS_6362()) + bcm_hwclock_set(CKCTL_6362_IPSEC_EN, enable); + else if (BCMCPU_IS_6368()) + bcm_hwclock_set(CKCTL_6368_IPSEC_EN, enable); } static struct clk clk_ipsec = { @@ -249,7 +271,10 @@ static struct clk clk_ipsec = { static void pcie_set(struct clk *clk, int enable) { - bcm_hwclock_set(CKCTL_6328_PCIE_EN, enable); + if (BCMCPU_IS_6328()) + bcm_hwclock_set(CKCTL_6328_PCIE_EN, enable); + else if (BCMCPU_IS_6362()) + bcm_hwclock_set(CKCTL_6362_PCIE_EN, enable); } static struct clk clk_pcie = { @@ -315,9 +340,9 @@ struct clk *clk_get(struct device *dev, const char *id) return &clk_periph; if (BCMCPU_IS_6358() && !strcmp(id, "pcm")) return &clk_pcm; - if (BCMCPU_IS_6368() && !strcmp(id, "ipsec")) + if ((BCMCPU_IS_6362() || BCMCPU_IS_6368()) && !strcmp(id, "ipsec")) return &clk_ipsec; - if (BCMCPU_IS_6328() && !strcmp(id, "pcie")) + if ((BCMCPU_IS_6328() || BCMCPU_IS_6362()) && !strcmp(id, "pcie")) return &clk_pcie; return ERR_PTR(-ENOENT); } diff --git a/arch/mips/bcm63xx/cpu.c b/arch/mips/bcm63xx/cpu.c index a7afb289b15a..79fe32df5e96 100644 --- a/arch/mips/bcm63xx/cpu.c +++ b/arch/mips/bcm63xx/cpu.c @@ -25,7 +25,7 @@ const int *bcm63xx_irqs; EXPORT_SYMBOL(bcm63xx_irqs); static u16 bcm63xx_cpu_id; -static u16 bcm63xx_cpu_rev; +static u8 bcm63xx_cpu_rev; static unsigned int bcm63xx_cpu_freq; static unsigned int bcm63xx_memory_size; @@ -71,6 +71,15 @@ static const int bcm6358_irqs[] = { }; +static const unsigned long bcm6362_regs_base[] = { + __GEN_CPU_REGS_TABLE(6362) +}; + +static const int bcm6362_irqs[] = { + __GEN_CPU_IRQ_TABLE(6362) + +}; + static const unsigned long bcm6368_regs_base[] = { __GEN_CPU_REGS_TABLE(6368) }; @@ -87,7 +96,7 @@ u16 __bcm63xx_get_cpu_id(void) EXPORT_SYMBOL(__bcm63xx_get_cpu_id); -u16 bcm63xx_get_cpu_rev(void) +u8 bcm63xx_get_cpu_rev(void) { return bcm63xx_cpu_rev; } @@ -169,6 +178,42 @@ static unsigned int detect_cpu_clock(void) return (16 * 1000000 * n1 * n2) / m1; } + case BCM6362_CPU_ID: + { + unsigned int tmp, mips_pll_fcvo; + + tmp = bcm_misc_readl(MISC_STRAPBUS_6362_REG); + mips_pll_fcvo = (tmp & STRAPBUS_6362_FCVO_MASK) + >> STRAPBUS_6362_FCVO_SHIFT; + switch (mips_pll_fcvo) { + case 0x03: + case 0x0b: + case 0x13: + case 0x1b: + return 240000000; + case 0x04: + case 0x0c: + case 0x14: + case 0x1c: + return 160000000; + case 0x05: + case 0x0e: + case 0x16: + case 0x1e: + case 0x1f: + return 400000000; + case 0x06: + return 440000000; + case 0x07: + case 0x17: + return 384000000; + case 0x15: + case 0x1d: + return 200000000; + default: + return 320000000; + } + } case BCM6368_CPU_ID: { unsigned int tmp, p1, p2, ndiv, m1; @@ -205,7 +250,7 @@ static unsigned int detect_memory_size(void) unsigned int cols = 0, rows = 0, is_32bits = 0, banks = 0; u32 val; - if (BCMCPU_IS_6328()) + if (BCMCPU_IS_6328() || BCMCPU_IS_6362()) return bcm_ddr_readl(DDR_CSEND_REG) << 24; if (BCMCPU_IS_6345()) { @@ -240,53 +285,27 @@ static unsigned int detect_memory_size(void) void __init bcm63xx_cpu_init(void) { - unsigned int tmp, expected_cpu_id; + unsigned int tmp; struct cpuinfo_mips *c = ¤t_cpu_data; unsigned int cpu = smp_processor_id(); + u32 chipid_reg; /* soc registers location depends on cpu type */ - expected_cpu_id = 0; + chipid_reg = 0; switch (c->cputype) { case CPU_BMIPS3300: - if ((read_c0_prid() & 0xff00) == PRID_IMP_BMIPS3300_ALT) { - expected_cpu_id = BCM6348_CPU_ID; - bcm63xx_regs_base = bcm6348_regs_base; - bcm63xx_irqs = bcm6348_irqs; - } else { + if ((read_c0_prid() & 0xff00) != PRID_IMP_BMIPS3300_ALT) __cpu_name[cpu] = "Broadcom BCM6338"; - expected_cpu_id = BCM6338_CPU_ID; - bcm63xx_regs_base = bcm6338_regs_base; - bcm63xx_irqs = bcm6338_irqs; - } - break; + /* fall-through */ case CPU_BMIPS32: - expected_cpu_id = BCM6345_CPU_ID; - bcm63xx_regs_base = bcm6345_regs_base; - bcm63xx_irqs = bcm6345_irqs; + chipid_reg = BCM_6345_PERF_BASE; break; case CPU_BMIPS4350: - if ((read_c0_prid() & 0xf0) == 0x10) { - expected_cpu_id = BCM6358_CPU_ID; - bcm63xx_regs_base = bcm6358_regs_base; - bcm63xx_irqs = bcm6358_irqs; - } else { - /* all newer chips have the same chip id location */ - u16 chip_id = bcm_readw(BCM_6368_PERF_BASE); - - switch (chip_id) { - case BCM6328_CPU_ID: - expected_cpu_id = BCM6328_CPU_ID; - bcm63xx_regs_base = bcm6328_regs_base; - bcm63xx_irqs = bcm6328_irqs; - break; - case BCM6368_CPU_ID: - expected_cpu_id = BCM6368_CPU_ID; - bcm63xx_regs_base = bcm6368_regs_base; - bcm63xx_irqs = bcm6368_irqs; - break; - } - } + if ((read_c0_prid() & 0xf0) == 0x10) + chipid_reg = BCM_6345_PERF_BASE; + else + chipid_reg = BCM_6368_PERF_BASE; break; } @@ -294,20 +313,47 @@ void __init bcm63xx_cpu_init(void) * really early to panic, but delaying panic would not help since we * will never get any working console */ - if (!expected_cpu_id) + if (!chipid_reg) panic("unsupported Broadcom CPU"); - /* - * bcm63xx_regs_base is set, we can access soc registers - */ - - /* double check CPU type */ - tmp = bcm_perf_readl(PERF_REV_REG); + /* read out CPU type */ + tmp = bcm_readl(chipid_reg); bcm63xx_cpu_id = (tmp & REV_CHIPID_MASK) >> REV_CHIPID_SHIFT; bcm63xx_cpu_rev = (tmp & REV_REVID_MASK) >> REV_REVID_SHIFT; - if (bcm63xx_cpu_id != expected_cpu_id) - panic("bcm63xx CPU id mismatch"); + switch (bcm63xx_cpu_id) { + case BCM6328_CPU_ID: + bcm63xx_regs_base = bcm6328_regs_base; + bcm63xx_irqs = bcm6328_irqs; + break; + case BCM6338_CPU_ID: + bcm63xx_regs_base = bcm6338_regs_base; + bcm63xx_irqs = bcm6338_irqs; + break; + case BCM6345_CPU_ID: + bcm63xx_regs_base = bcm6345_regs_base; + bcm63xx_irqs = bcm6345_irqs; + break; + case BCM6348_CPU_ID: + bcm63xx_regs_base = bcm6348_regs_base; + bcm63xx_irqs = bcm6348_irqs; + break; + case BCM6358_CPU_ID: + bcm63xx_regs_base = bcm6358_regs_base; + bcm63xx_irqs = bcm6358_irqs; + break; + case BCM6362_CPU_ID: + bcm63xx_regs_base = bcm6362_regs_base; + bcm63xx_irqs = bcm6362_irqs; + break; + case BCM6368_CPU_ID: + bcm63xx_regs_base = bcm6368_regs_base; + bcm63xx_irqs = bcm6368_irqs; + break; + default: + panic("unsupported broadcom CPU %x", bcm63xx_cpu_id); + break; + } bcm63xx_cpu_freq = detect_cpu_clock(); bcm63xx_memory_size = detect_memory_size(); diff --git a/arch/mips/bcm63xx/dev-flash.c b/arch/mips/bcm63xx/dev-flash.c index 58371c7deac2..588d1ec622e4 100644 --- a/arch/mips/bcm63xx/dev-flash.c +++ b/arch/mips/bcm63xx/dev-flash.c @@ -77,6 +77,12 @@ static int __init bcm63xx_detect_flash_type(void) return BCM63XX_FLASH_TYPE_PARALLEL; else return BCM63XX_FLASH_TYPE_SERIAL; + case BCM6362_CPU_ID: + val = bcm_misc_readl(MISC_STRAPBUS_6362_REG); + if (val & STRAPBUS_6362_BOOT_SEL_SERIAL) + return BCM63XX_FLASH_TYPE_SERIAL; + else + return BCM63XX_FLASH_TYPE_NAND; case BCM6368_CPU_ID: val = bcm_gpio_readl(GPIO_STRAPBUS_REG); switch (val & STRAPBUS_6368_BOOT_SEL_MASK) { diff --git a/arch/mips/bcm63xx/dev-spi.c b/arch/mips/bcm63xx/dev-spi.c index e97fd60e92ef..3065bb61820d 100644 --- a/arch/mips/bcm63xx/dev-spi.c +++ b/arch/mips/bcm63xx/dev-spi.c @@ -22,10 +22,6 @@ /* * register offsets */ -static const unsigned long bcm6338_regs_spi[] = { - __GEN_SPI_REGS_TABLE(6338) -}; - static const unsigned long bcm6348_regs_spi[] = { __GEN_SPI_REGS_TABLE(6348) }; @@ -34,23 +30,15 @@ static const unsigned long bcm6358_regs_spi[] = { __GEN_SPI_REGS_TABLE(6358) }; -static const unsigned long bcm6368_regs_spi[] = { - __GEN_SPI_REGS_TABLE(6368) -}; - const unsigned long *bcm63xx_regs_spi; EXPORT_SYMBOL(bcm63xx_regs_spi); static __init void bcm63xx_spi_regs_init(void) { - if (BCMCPU_IS_6338()) - bcm63xx_regs_spi = bcm6338_regs_spi; - if (BCMCPU_IS_6348()) + if (BCMCPU_IS_6338() || BCMCPU_IS_6348()) bcm63xx_regs_spi = bcm6348_regs_spi; - if (BCMCPU_IS_6358()) + if (BCMCPU_IS_6358() || BCMCPU_IS_6362() || BCMCPU_IS_6368()) bcm63xx_regs_spi = bcm6358_regs_spi; - if (BCMCPU_IS_6368()) - bcm63xx_regs_spi = bcm6368_regs_spi; } #else static __init void bcm63xx_spi_regs_init(void) { } @@ -93,13 +81,13 @@ int __init bcm63xx_spi_register(void) spi_resources[1].start = bcm63xx_get_irq_number(IRQ_SPI); if (BCMCPU_IS_6338() || BCMCPU_IS_6348()) { - spi_resources[0].end += BCM_6338_RSET_SPI_SIZE - 1; - spi_pdata.fifo_size = SPI_6338_MSG_DATA_SIZE; - spi_pdata.msg_type_shift = SPI_6338_MSG_TYPE_SHIFT; - spi_pdata.msg_ctl_width = SPI_6338_MSG_CTL_WIDTH; + spi_resources[0].end += BCM_6348_RSET_SPI_SIZE - 1; + spi_pdata.fifo_size = SPI_6348_MSG_DATA_SIZE; + spi_pdata.msg_type_shift = SPI_6348_MSG_TYPE_SHIFT; + spi_pdata.msg_ctl_width = SPI_6348_MSG_CTL_WIDTH; } - if (BCMCPU_IS_6358() || BCMCPU_IS_6368()) { + if (BCMCPU_IS_6358() || BCMCPU_IS_6362() || BCMCPU_IS_6368()) { spi_resources[0].end += BCM_6358_RSET_SPI_SIZE - 1; spi_pdata.fifo_size = SPI_6358_MSG_DATA_SIZE; spi_pdata.msg_type_shift = SPI_6358_MSG_TYPE_SHIFT; diff --git a/arch/mips/bcm63xx/irq.c b/arch/mips/bcm63xx/irq.c index da24c2bd9b7c..c0ab3887f42e 100644 --- a/arch/mips/bcm63xx/irq.c +++ b/arch/mips/bcm63xx/irq.c @@ -82,6 +82,17 @@ static void __internal_irq_unmask_64(unsigned int irq) __maybe_unused; #define ext_irq_cfg_reg1 PERF_EXTIRQ_CFG_REG_6358 #define ext_irq_cfg_reg2 0 #endif +#ifdef CONFIG_BCM63XX_CPU_6362 +#define irq_stat_reg PERF_IRQSTAT_6362_REG +#define irq_mask_reg PERF_IRQMASK_6362_REG +#define irq_bits 64 +#define is_ext_irq_cascaded 1 +#define ext_irq_start (BCM_6362_EXT_IRQ0 - IRQ_INTERNAL_BASE) +#define ext_irq_end (BCM_6362_EXT_IRQ3 - IRQ_INTERNAL_BASE) +#define ext_irq_count 4 +#define ext_irq_cfg_reg1 PERF_EXTIRQ_CFG_REG_6362 +#define ext_irq_cfg_reg2 0 +#endif #ifdef CONFIG_BCM63XX_CPU_6368 #define irq_stat_reg PERF_IRQSTAT_6368_REG #define irq_mask_reg PERF_IRQMASK_6368_REG @@ -170,6 +181,16 @@ static void bcm63xx_init_irq(void) ext_irq_end = BCM_6358_EXT_IRQ3 - IRQ_INTERNAL_BASE; ext_irq_cfg_reg1 = PERF_EXTIRQ_CFG_REG_6358; break; + case BCM6362_CPU_ID: + irq_stat_addr += PERF_IRQSTAT_6362_REG; + irq_mask_addr += PERF_IRQMASK_6362_REG; + irq_bits = 64; + ext_irq_count = 4; + is_ext_irq_cascaded = 1; + ext_irq_start = BCM_6362_EXT_IRQ0 - IRQ_INTERNAL_BASE; + ext_irq_end = BCM_6362_EXT_IRQ3 - IRQ_INTERNAL_BASE; + ext_irq_cfg_reg1 = PERF_EXTIRQ_CFG_REG_6362; + break; case BCM6368_CPU_ID: irq_stat_addr += PERF_IRQSTAT_6368_REG; irq_mask_addr += PERF_IRQMASK_6368_REG; @@ -458,6 +479,7 @@ static int bcm63xx_external_irq_set_type(struct irq_data *d, case BCM6338_CPU_ID: case BCM6345_CPU_ID: case BCM6358_CPU_ID: + case BCM6362_CPU_ID: case BCM6368_CPU_ID: if (levelsense) reg |= EXTIRQ_CFG_LEVELSENSE(irq); diff --git a/arch/mips/bcm63xx/prom.c b/arch/mips/bcm63xx/prom.c index 10eaff458071..fd698087fbfd 100644 --- a/arch/mips/bcm63xx/prom.c +++ b/arch/mips/bcm63xx/prom.c @@ -36,6 +36,8 @@ void __init prom_init(void) mask = CKCTL_6348_ALL_SAFE_EN; else if (BCMCPU_IS_6358()) mask = CKCTL_6358_ALL_SAFE_EN; + else if (BCMCPU_IS_6362()) + mask = CKCTL_6362_ALL_SAFE_EN; else if (BCMCPU_IS_6368()) mask = CKCTL_6368_ALL_SAFE_EN; else diff --git a/arch/mips/bcm63xx/reset.c b/arch/mips/bcm63xx/reset.c index 68a31bb90cbf..317931c6cf58 100644 --- a/arch/mips/bcm63xx/reset.c +++ b/arch/mips/bcm63xx/reset.c @@ -85,6 +85,20 @@ #define BCM6358_RESET_PCIE 0 #define BCM6358_RESET_PCIE_EXT 0 +#define BCM6362_RESET_SPI SOFTRESET_6362_SPI_MASK +#define BCM6362_RESET_ENET 0 +#define BCM6362_RESET_USBH SOFTRESET_6362_USBH_MASK +#define BCM6362_RESET_USBD SOFTRESET_6362_USBS_MASK +#define BCM6362_RESET_DSL 0 +#define BCM6362_RESET_SAR SOFTRESET_6362_SAR_MASK +#define BCM6362_RESET_EPHY SOFTRESET_6362_EPHY_MASK +#define BCM6362_RESET_ENETSW SOFTRESET_6362_ENETSW_MASK +#define BCM6362_RESET_PCM SOFTRESET_6362_PCM_MASK +#define BCM6362_RESET_MPI 0 +#define BCM6362_RESET_PCIE (SOFTRESET_6362_PCIE_MASK | \ + SOFTRESET_6362_PCIE_CORE_MASK) +#define BCM6362_RESET_PCIE_EXT SOFTRESET_6362_PCIE_EXT_MASK + #define BCM6368_RESET_SPI SOFTRESET_6368_SPI_MASK #define BCM6368_RESET_ENET 0 #define BCM6368_RESET_USBH SOFTRESET_6368_USBH_MASK @@ -119,6 +133,10 @@ static const u32 bcm6358_reset_bits[] = { __GEN_RESET_BITS_TABLE(6358) }; +static const u32 bcm6362_reset_bits[] = { + __GEN_RESET_BITS_TABLE(6362) +}; + static const u32 bcm6368_reset_bits[] = { __GEN_RESET_BITS_TABLE(6368) }; @@ -140,6 +158,9 @@ static int __init bcm63xx_reset_bits_init(void) } else if (BCMCPU_IS_6358()) { reset_reg = PERF_SOFTRESET_6358_REG; bcm63xx_reset_bits = bcm6358_reset_bits; + } else if (BCMCPU_IS_6362()) { + reset_reg = PERF_SOFTRESET_6362_REG; + bcm63xx_reset_bits = bcm6362_reset_bits; } else if (BCMCPU_IS_6368()) { reset_reg = PERF_SOFTRESET_6368_REG; bcm63xx_reset_bits = bcm6368_reset_bits; @@ -182,6 +203,13 @@ static const u32 bcm63xx_reset_bits[] = { #define reset_reg PERF_SOFTRESET_6358_REG #endif +#ifdef CONFIG_BCM63XX_CPU_6362 +static const u32 bcm63xx_reset_bits[] = { + __GEN_RESET_BITS_TABLE(6362) +}; +#define reset_reg PERF_SOFTRESET_6362_REG +#endif + #ifdef CONFIG_BCM63XX_CPU_6368 static const u32 bcm63xx_reset_bits[] = { __GEN_RESET_BITS_TABLE(6368) diff --git a/arch/mips/bcm63xx/setup.c b/arch/mips/bcm63xx/setup.c index 35e18e98beb9..24a24445db64 100644 --- a/arch/mips/bcm63xx/setup.c +++ b/arch/mips/bcm63xx/setup.c @@ -83,6 +83,9 @@ void bcm63xx_machine_reboot(void) case BCM6358_CPU_ID: perf_regs[0] = PERF_EXTIRQ_CFG_REG_6358; break; + case BCM6362_CPU_ID: + perf_regs[0] = PERF_EXTIRQ_CFG_REG_6362; + break; } for (i = 0; i < 2; i++) { @@ -126,7 +129,7 @@ static void __bcm63xx_machine_reboot(char *p) const char *get_system_type(void) { static char buf[128]; - snprintf(buf, sizeof(buf), "bcm63xx/%s (0x%04x/0x%04X)", + snprintf(buf, sizeof(buf), "bcm63xx/%s (0x%04x/0x%02X)", board_get_name(), bcm63xx_get_cpu_id(), bcm63xx_get_cpu_rev()); return buf; diff --git a/arch/mips/cavium-octeon/octeon-irq.c b/arch/mips/cavium-octeon/octeon-irq.c index 156aa6143e11..a22f06a6f7ca 100644 --- a/arch/mips/cavium-octeon/octeon-irq.c +++ b/arch/mips/cavium-octeon/octeon-irq.c @@ -1032,9 +1032,8 @@ static int octeon_irq_gpio_map_common(struct irq_domain *d, if (!octeon_irq_virq_in_range(virq)) return -EINVAL; - hw += gpiod->base_hwirq; - line = hw >> 6; - bit = hw & 63; + line = (hw + gpiod->base_hwirq) >> 6; + bit = (hw + gpiod->base_hwirq) & 63; if (line > line_limit || octeon_irq_ciu_to_irq[line][bit] != 0) return -EINVAL; diff --git a/arch/mips/configs/malta_defconfig b/arch/mips/configs/malta_defconfig index cd732e5b4fd5..ce1d3eeeb737 100644 --- a/arch/mips/configs/malta_defconfig +++ b/arch/mips/configs/malta_defconfig @@ -2,30 +2,21 @@ CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y CONFIG_MIPS_MT_SMP=y -CONFIG_NO_HZ=y -CONFIG_HIGH_RES_TIMERS=y CONFIG_HZ_100=y -CONFIG_EXPERIMENTAL=y CONFIG_SYSVIPC=y +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_SYSFS_DEPRECATED_V2=y -CONFIG_RELAY=y CONFIG_NAMESPACES=y -CONFIG_UTS_NS=y -CONFIG_IPC_NS=y -CONFIG_PID_NS=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_RELAY=y CONFIG_EXPERT=y -# CONFIG_SYSCTL_SYSCALL is not set # CONFIG_COMPAT_BRK is not set CONFIG_SLAB=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y -# CONFIG_BLK_DEV_BSG is not set CONFIG_PCI=y -CONFIG_PM=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_XFRM_USER=m @@ -41,8 +32,6 @@ CONFIG_IP_PNP=y CONFIG_IP_PNP_DHCP=y CONFIG_IP_PNP_BOOTP=y CONFIG_NET_IPIP=m -CONFIG_NET_IPGRE=m -CONFIG_NET_IPGRE_BROADCAST=y CONFIG_IP_MROUTE=y CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y @@ -65,7 +54,6 @@ CONFIG_IPV6_MROUTE=y CONFIG_IPV6_PIMSM_V2=y CONFIG_NETWORK_SECMARK=y CONFIG_NETFILTER=y -CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NF_CONNTRACK=m CONFIG_NF_CONNTRACK_SECMARK=y CONFIG_NF_CONNTRACK_EVENTS=y @@ -136,23 +124,15 @@ CONFIG_IP_VS_DH=m CONFIG_IP_VS_SH=m CONFIG_IP_VS_SED=m CONFIG_IP_VS_NQ=m -CONFIG_IP_VS_FTP=m CONFIG_NF_CONNTRACK_IPV4=m CONFIG_IP_NF_QUEUE=m CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m CONFIG_IP_NF_MATCH_AH=m CONFIG_IP_NF_MATCH_ECN=m CONFIG_IP_NF_MATCH_TTL=m CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_NF_NAT=m -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_NF_NAT_SNMP_BASIC=m CONFIG_IP_NF_MANGLE=m CONFIG_IP_NF_TARGET_CLUSTERIP=m CONFIG_IP_NF_TARGET_ECN=m @@ -162,8 +142,6 @@ CONFIG_IP_NF_ARPTABLES=m CONFIG_IP_NF_ARPFILTER=m CONFIG_IP_NF_ARP_MANGLE=m CONFIG_NF_CONNTRACK_IPV6=m -CONFIG_IP6_NF_QUEUE=m -CONFIG_IP6_NF_IPTABLES=m CONFIG_IP6_NF_MATCH_AH=m CONFIG_IP6_NF_MATCH_EUI64=m CONFIG_IP6_NF_MATCH_FRAG=m @@ -173,7 +151,6 @@ CONFIG_IP6_NF_MATCH_IPV6HEADER=m CONFIG_IP6_NF_MATCH_MH=m CONFIG_IP6_NF_MATCH_RT=m CONFIG_IP6_NF_TARGET_HL=m -CONFIG_IP6_NF_TARGET_LOG=m CONFIG_IP6_NF_FILTER=m CONFIG_IP6_NF_TARGET_REJECT=m CONFIG_IP6_NF_MANGLE=m @@ -247,12 +224,10 @@ CONFIG_MAC80211=m CONFIG_MAC80211_RC_PID=y CONFIG_MAC80211_RC_DEFAULT_PID=y CONFIG_MAC80211_MESH=y -CONFIG_MAC80211_LEDS=y CONFIG_RFKILL=m CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_CONNECTOR=m CONFIG_MTD=y -CONFIG_MTD_PARTITIONS=y CONFIG_MTD_CHAR=y CONFIG_MTD_BLOCK=y CONFIG_MTD_OOPS=m @@ -271,7 +246,6 @@ CONFIG_BLK_DEV_NBD=m CONFIG_BLK_DEV_RAM=y CONFIG_CDROM_PKTCDVD=m CONFIG_ATA_OVER_ETH=m -# CONFIG_MISC_DEVICES is not set CONFIG_IDE=y CONFIG_BLK_DEV_IDECD=y CONFIG_IDE_GENERIC=y @@ -317,13 +291,19 @@ CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m CONFIG_DM_MULTIPATH=m CONFIG_NETDEVICES=y -CONFIG_IFB=m -CONFIG_DUMMY=m CONFIG_BONDING=m -CONFIG_MACVLAN=m +CONFIG_DUMMY=m CONFIG_EQUALIZER=m +CONFIG_IFB=m +CONFIG_MACVLAN=m CONFIG_TUN=m CONFIG_VETH=m +# CONFIG_NET_VENDOR_3COM is not set +CONFIG_PCNET32=y +CONFIG_CHELSIO_T3=m +CONFIG_AX88796=m +CONFIG_NETXEN_NIC=m +CONFIG_TC35815=m CONFIG_MARVELL_PHY=m CONFIG_DAVICOM_PHY=m CONFIG_QSEMI_PHY=m @@ -334,14 +314,6 @@ CONFIG_SMSC_PHY=m CONFIG_BROADCOM_PHY=m CONFIG_ICPLUS_PHY=m CONFIG_REALTEK_PHY=m -CONFIG_MDIO_BITBANG=m -CONFIG_NET_ETHERNET=y -CONFIG_AX88796=m -CONFIG_NET_PCI=y -CONFIG_PCNET32=y -CONFIG_TC35815=m -CONFIG_CHELSIO_T3=m -CONFIG_NETXEN_NIC=m CONFIG_ATMEL=m CONFIG_PCI_ATMEL=m CONFIG_PRISM54=m @@ -352,15 +324,7 @@ CONFIG_HOSTAP_PLX=m CONFIG_HOSTAP_PCI=m CONFIG_IPW2100=m CONFIG_IPW2100_MONITOR=y -CONFIG_IPW2200=m -CONFIG_IPW2200_MONITOR=y -CONFIG_IPW2200_PROMISCUOUS=y -CONFIG_IPW2200_QOS=y CONFIG_LIBERTAS=m -CONFIG_HERMES=m -CONFIG_PLX_HERMES=m -CONFIG_TMD_HERMES=m -CONFIG_NORTEL_HERMES=m # CONFIG_INPUT_KEYBOARD is not set # CONFIG_INPUT_MOUSE is not set # CONFIG_SERIO_I8042 is not set @@ -373,12 +337,6 @@ CONFIG_FB_CIRRUS=y # CONFIG_VGA_CONSOLE is not set CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_HID=m -CONFIG_LEDS_CLASS=y -CONFIG_LEDS_TRIGGER_TIMER=m -CONFIG_LEDS_TRIGGER_IDE_DISK=y -CONFIG_LEDS_TRIGGER_HEARTBEAT=m -CONFIG_LEDS_TRIGGER_BACKLIGHT=m -CONFIG_LEDS_TRIGGER_DEFAULT_ON=m CONFIG_RTC_CLASS=y CONFIG_RTC_DRV_CMOS=y CONFIG_UIO=m @@ -398,7 +356,6 @@ CONFIG_XFS_QUOTA=y CONFIG_XFS_POSIX_ACL=y CONFIG_QUOTA=y CONFIG_QFMT_V2=y -CONFIG_AUTOFS_FS=y CONFIG_FUSE_FS=m CONFIG_ISO9660_FS=m CONFIG_JOLIET=y @@ -425,7 +382,6 @@ CONFIG_ROMFS_FS=m CONFIG_SYSV_FS=m CONFIG_UFS_FS=m CONFIG_NFS_FS=y -CONFIG_NFS_V3=y CONFIG_ROOT_NFS=y CONFIG_NFSD=y CONFIG_NFSD_V3=y @@ -466,7 +422,6 @@ CONFIG_NLS_ISO8859_14=m CONFIG_NLS_ISO8859_15=m CONFIG_NLS_KOI8_R=m CONFIG_NLS_KOI8_U=m -# CONFIG_RCU_CPU_STALL_DETECTOR is not set CONFIG_CRYPTO_NULL=m CONFIG_CRYPTO_CRYPTD=m CONFIG_CRYPTO_LRW=m diff --git a/arch/mips/configs/malta_kvm_defconfig b/arch/mips/configs/malta_kvm_defconfig new file mode 100644 index 000000000000..341bb47204d6 --- /dev/null +++ b/arch/mips/configs/malta_kvm_defconfig @@ -0,0 +1,456 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_PAGE_SIZE_16KB=y +CONFIG_MIPS_MT_SMP=y +CONFIG_HZ_100=y +CONFIG_SYSVIPC=y +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_NAMESPACES=y +CONFIG_RELAY=y +CONFIG_EXPERT=y +CONFIG_PERF_EVENTS=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +CONFIG_PCI=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_NET_KEY_MIGRATE=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_XFRM_MODE_TRANSPORT=m +CONFIG_INET_XFRM_MODE_TUNNEL=m +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6_PRIVACY=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_NETWORK_SECMARK=y +CONFIG_NETFILTER=y +CONFIG_NF_CONNTRACK=m +CONFIG_NF_CONNTRACK_SECMARK=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CT_PROTO_DCCP=m +CONFIG_NF_CT_PROTO_UDPLITE=m +CONFIG_NF_CONNTRACK_AMANDA=m +CONFIG_NF_CONNTRACK_FTP=m +CONFIG_NF_CONNTRACK_H323=m +CONFIG_NF_CONNTRACK_IRC=m +CONFIG_NF_CONNTRACK_PPTP=m +CONFIG_NF_CONNTRACK_SANE=m +CONFIG_NF_CONNTRACK_SIP=m +CONFIG_NF_CONNTRACK_TFTP=m +CONFIG_NF_CT_NETLINK=m +CONFIG_NETFILTER_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m +CONFIG_NETFILTER_XT_TARGET_CONNMARK=m +CONFIG_NETFILTER_XT_TARGET_MARK=m +CONFIG_NETFILTER_XT_TARGET_NFLOG=m +CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m +CONFIG_NETFILTER_XT_TARGET_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_TRACE=m +CONFIG_NETFILTER_XT_TARGET_SECMARK=m +CONFIG_NETFILTER_XT_TARGET_TCPMSS=m +CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m +CONFIG_NETFILTER_XT_MATCH_COMMENT=m +CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m +CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m +CONFIG_NETFILTER_XT_MATCH_CONNMARK=m +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m +CONFIG_NETFILTER_XT_MATCH_DCCP=m +CONFIG_NETFILTER_XT_MATCH_ESP=m +CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m +CONFIG_NETFILTER_XT_MATCH_HELPER=m +CONFIG_NETFILTER_XT_MATCH_IPRANGE=m +CONFIG_NETFILTER_XT_MATCH_LENGTH=m +CONFIG_NETFILTER_XT_MATCH_LIMIT=m +CONFIG_NETFILTER_XT_MATCH_MAC=m +CONFIG_NETFILTER_XT_MATCH_MARK=m +CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m +CONFIG_NETFILTER_XT_MATCH_OWNER=m +CONFIG_NETFILTER_XT_MATCH_POLICY=m +CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m +CONFIG_NETFILTER_XT_MATCH_QUOTA=m +CONFIG_NETFILTER_XT_MATCH_RATEEST=m +CONFIG_NETFILTER_XT_MATCH_REALM=m +CONFIG_NETFILTER_XT_MATCH_RECENT=m +CONFIG_NETFILTER_XT_MATCH_SOCKET=m +CONFIG_NETFILTER_XT_MATCH_STATE=m +CONFIG_NETFILTER_XT_MATCH_STATISTIC=m +CONFIG_NETFILTER_XT_MATCH_STRING=m +CONFIG_NETFILTER_XT_MATCH_TCPMSS=m +CONFIG_NETFILTER_XT_MATCH_TIME=m +CONFIG_NETFILTER_XT_MATCH_U32=m +CONFIG_IP_VS=m +CONFIG_IP_VS_IPV6=y +CONFIG_IP_VS_PROTO_TCP=y +CONFIG_IP_VS_PROTO_UDP=y +CONFIG_IP_VS_PROTO_ESP=y +CONFIG_IP_VS_PROTO_AH=y +CONFIG_IP_VS_RR=m +CONFIG_IP_VS_WRR=m +CONFIG_IP_VS_LC=m +CONFIG_IP_VS_WLC=m +CONFIG_IP_VS_LBLC=m +CONFIG_IP_VS_LBLCR=m +CONFIG_IP_VS_DH=m +CONFIG_IP_VS_SH=m +CONFIG_IP_VS_SED=m +CONFIG_IP_VS_NQ=m +CONFIG_NF_CONNTRACK_IPV4=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_AH=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_TTL=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m +CONFIG_NF_CONNTRACK_IPV6=m +CONFIG_IP6_NF_MATCH_AH=m +CONFIG_IP6_NF_MATCH_EUI64=m +CONFIG_IP6_NF_MATCH_FRAG=m +CONFIG_IP6_NF_MATCH_OPTS=m +CONFIG_IP6_NF_MATCH_HL=m +CONFIG_IP6_NF_MATCH_IPV6HEADER=m +CONFIG_IP6_NF_MATCH_MH=m +CONFIG_IP6_NF_MATCH_RT=m +CONFIG_IP6_NF_TARGET_HL=m +CONFIG_IP6_NF_FILTER=m +CONFIG_IP6_NF_TARGET_REJECT=m +CONFIG_IP6_NF_MANGLE=m +CONFIG_IP6_NF_RAW=m +CONFIG_BRIDGE_NF_EBTABLES=m +CONFIG_BRIDGE_EBT_BROUTE=m +CONFIG_BRIDGE_EBT_T_FILTER=m +CONFIG_BRIDGE_EBT_T_NAT=m +CONFIG_BRIDGE_EBT_802_3=m +CONFIG_BRIDGE_EBT_AMONG=m +CONFIG_BRIDGE_EBT_ARP=m +CONFIG_BRIDGE_EBT_IP=m +CONFIG_BRIDGE_EBT_IP6=m +CONFIG_BRIDGE_EBT_LIMIT=m +CONFIG_BRIDGE_EBT_MARK=m +CONFIG_BRIDGE_EBT_PKTTYPE=m +CONFIG_BRIDGE_EBT_STP=m +CONFIG_BRIDGE_EBT_VLAN=m +CONFIG_BRIDGE_EBT_ARPREPLY=m +CONFIG_BRIDGE_EBT_DNAT=m +CONFIG_BRIDGE_EBT_MARK_T=m +CONFIG_BRIDGE_EBT_REDIRECT=m +CONFIG_BRIDGE_EBT_SNAT=m +CONFIG_BRIDGE_EBT_LOG=m +CONFIG_BRIDGE_EBT_ULOG=m +CONFIG_BRIDGE_EBT_NFLOG=m +CONFIG_IP_SCTP=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_VLAN_8021Q_GVRP=y +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_PHONET=m +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_FLOW=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_ACT_GACT=m +CONFIG_GACT_PROB=y +CONFIG_NET_ACT_MIRRED=m +CONFIG_NET_ACT_IPT=m +CONFIG_NET_ACT_NAT=m +CONFIG_NET_ACT_PEDIT=m +CONFIG_NET_ACT_SIMP=m +CONFIG_NET_ACT_SKBEDIT=m +CONFIG_NET_CLS_IND=y +CONFIG_CFG80211=m +CONFIG_MAC80211=m +CONFIG_MAC80211_RC_PID=y +CONFIG_MAC80211_RC_DEFAULT_PID=y +CONFIG_MAC80211_MESH=y +CONFIG_RFKILL=m +CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" +CONFIG_CONNECTOR=m +CONFIG_MTD=y +CONFIG_MTD_CHAR=y +CONFIG_MTD_BLOCK=y +CONFIG_MTD_OOPS=m +CONFIG_MTD_CFI=y +CONFIG_MTD_CFI_INTELEXT=y +CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_CFI_STAA=y +CONFIG_MTD_PHYSMAP=y +CONFIG_MTD_UBI=m +CONFIG_MTD_UBI_GLUEBI=m +CONFIG_BLK_DEV_FD=m +CONFIG_BLK_DEV_UMEM=m +CONFIG_BLK_DEV_LOOP=m +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_BLK_DEV_NBD=m +CONFIG_BLK_DEV_RAM=y +CONFIG_CDROM_PKTCDVD=m +CONFIG_ATA_OVER_ETH=m +CONFIG_IDE=y +CONFIG_BLK_DEV_IDECD=y +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_BLK_DEV_IT8213=m +CONFIG_BLK_DEV_TC86C001=m +CONFIG_RAID_ATTRS=m +CONFIG_SCSI=m +CONFIG_SCSI_TGT=m +CONFIG_BLK_DEV_SD=m +CONFIG_CHR_DEV_ST=m +CONFIG_CHR_DEV_OSST=m +CONFIG_BLK_DEV_SR=m +CONFIG_BLK_DEV_SR_VENDOR=y +CONFIG_CHR_DEV_SG=m +CONFIG_SCSI_MULTI_LUN=y +CONFIG_SCSI_CONSTANTS=y +CONFIG_SCSI_LOGGING=y +CONFIG_SCSI_SCAN_ASYNC=y +CONFIG_SCSI_FC_ATTRS=m +CONFIG_ISCSI_TCP=m +CONFIG_BLK_DEV_3W_XXXX_RAID=m +CONFIG_SCSI_3W_9XXX=m +CONFIG_SCSI_ACARD=m +CONFIG_SCSI_AACRAID=m +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +# CONFIG_AIC7XXX_DEBUG_ENABLE is not set +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +CONFIG_MD_RAID10=m +CONFIG_MD_RAID456=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +CONFIG_BLK_DEV_DM=m +CONFIG_DM_CRYPT=m +CONFIG_DM_SNAPSHOT=m +CONFIG_DM_MIRROR=m +CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_NETDEVICES=y +CONFIG_BONDING=m +CONFIG_DUMMY=m +CONFIG_EQUALIZER=m +CONFIG_IFB=m +CONFIG_MACVLAN=m +CONFIG_TUN=m +CONFIG_VETH=m +CONFIG_PCNET32=y +CONFIG_CHELSIO_T3=m +CONFIG_AX88796=m +CONFIG_NETXEN_NIC=m +CONFIG_TC35815=m +CONFIG_MARVELL_PHY=m +CONFIG_DAVICOM_PHY=m +CONFIG_QSEMI_PHY=m +CONFIG_LXT_PHY=m +CONFIG_CICADA_PHY=m +CONFIG_VITESSE_PHY=m +CONFIG_SMSC_PHY=m +CONFIG_BROADCOM_PHY=m +CONFIG_ICPLUS_PHY=m +CONFIG_REALTEK_PHY=m +CONFIG_ATMEL=m +CONFIG_PCI_ATMEL=m +CONFIG_PRISM54=m +CONFIG_HOSTAP=m +CONFIG_HOSTAP_FIRMWARE=y +CONFIG_HOSTAP_FIRMWARE_NVRAM=y +CONFIG_HOSTAP_PLX=m +CONFIG_HOSTAP_PCI=m +CONFIG_IPW2100=m +CONFIG_IPW2100_MONITOR=y +CONFIG_LIBERTAS=m +# CONFIG_INPUT_KEYBOARD is not set +# CONFIG_INPUT_MOUSE is not set +# CONFIG_SERIO_I8042 is not set +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +# CONFIG_HWMON is not set +CONFIG_FB=y +CONFIG_FB_CIRRUS=y +# CONFIG_VGA_CONSOLE is not set +CONFIG_FRAMEBUFFER_CONSOLE=y +CONFIG_HID=m +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_UIO=m +CONFIG_UIO_CIF=m +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +CONFIG_REISERFS_FS=m +CONFIG_REISERFS_PROC_INFO=y +CONFIG_REISERFS_FS_XATTR=y +CONFIG_REISERFS_FS_POSIX_ACL=y +CONFIG_REISERFS_FS_SECURITY=y +CONFIG_JFS_FS=m +CONFIG_JFS_POSIX_ACL=y +CONFIG_JFS_SECURITY=y +CONFIG_XFS_FS=m +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_FUSE_FS=m +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_UDF_FS=m +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_CONFIGFS_FS=y +CONFIG_AFFS_FS=m +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +CONFIG_BEFS_FS=m +CONFIG_BFS_FS=m +CONFIG_EFS_FS=m +CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_COMPRESSION_OPTIONS=y +CONFIG_JFFS2_RUBIN=y +CONFIG_CRAMFS=m +CONFIG_VXFS_FS=m +CONFIG_MINIX_FS=m +CONFIG_ROMFS_FS=m +CONFIG_SYSV_FS=m +CONFIG_UFS_FS=m +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_NFSD=y +CONFIG_NFSD_V3=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_CODEPAGE_737=m +CONFIG_NLS_CODEPAGE_775=m +CONFIG_NLS_CODEPAGE_850=m +CONFIG_NLS_CODEPAGE_852=m +CONFIG_NLS_CODEPAGE_855=m +CONFIG_NLS_CODEPAGE_857=m +CONFIG_NLS_CODEPAGE_860=m +CONFIG_NLS_CODEPAGE_861=m +CONFIG_NLS_CODEPAGE_862=m +CONFIG_NLS_CODEPAGE_863=m +CONFIG_NLS_CODEPAGE_864=m +CONFIG_NLS_CODEPAGE_865=m +CONFIG_NLS_CODEPAGE_866=m +CONFIG_NLS_CODEPAGE_869=m +CONFIG_NLS_CODEPAGE_936=m +CONFIG_NLS_CODEPAGE_950=m +CONFIG_NLS_CODEPAGE_932=m +CONFIG_NLS_CODEPAGE_949=m +CONFIG_NLS_CODEPAGE_874=m +CONFIG_NLS_ISO8859_8=m +CONFIG_NLS_CODEPAGE_1250=m +CONFIG_NLS_CODEPAGE_1251=m +CONFIG_NLS_ASCII=m +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +CONFIG_NLS_KOI8_R=m +CONFIG_NLS_KOI8_U=m +CONFIG_RCU_CPU_STALL_TIMEOUT=60 +CONFIG_ENABLE_DEFAULT_TRACERS=y +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_CRYPTD=m +CONFIG_CRYPTO_LRW=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_XCBC=m +CONFIG_CRYPTO_MD4=m +CONFIG_CRYPTO_SHA256=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAMELLIA=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_FCRYPT=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +CONFIG_CRC16=m +CONFIG_VIRTUALIZATION=y +CONFIG_KVM=m +CONFIG_KVM_MIPS_DYN_TRANS=y +CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS=y +CONFIG_VHOST_NET=m diff --git a/arch/mips/configs/malta_kvm_guest_defconfig b/arch/mips/configs/malta_kvm_guest_defconfig new file mode 100644 index 000000000000..2b8558b71080 --- /dev/null +++ b/arch/mips/configs/malta_kvm_guest_defconfig @@ -0,0 +1,453 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_KVM_GUEST=y +CONFIG_PAGE_SIZE_16KB=y +CONFIG_HZ_100=y +CONFIG_SYSVIPC=y +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_NAMESPACES=y +CONFIG_RELAY=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_EXPERT=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +CONFIG_PCI=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_NET_KEY_MIGRATE=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_XFRM_MODE_TRANSPORT=m +CONFIG_INET_XFRM_MODE_TUNNEL=m +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6_PRIVACY=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_NETWORK_SECMARK=y +CONFIG_NETFILTER=y +CONFIG_NF_CONNTRACK=m +CONFIG_NF_CONNTRACK_SECMARK=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CT_PROTO_DCCP=m +CONFIG_NF_CT_PROTO_UDPLITE=m +CONFIG_NF_CONNTRACK_AMANDA=m +CONFIG_NF_CONNTRACK_FTP=m +CONFIG_NF_CONNTRACK_H323=m +CONFIG_NF_CONNTRACK_IRC=m +CONFIG_NF_CONNTRACK_PPTP=m +CONFIG_NF_CONNTRACK_SANE=m +CONFIG_NF_CONNTRACK_SIP=m +CONFIG_NF_CONNTRACK_TFTP=m +CONFIG_NF_CT_NETLINK=m +CONFIG_NETFILTER_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m +CONFIG_NETFILTER_XT_TARGET_CONNMARK=m +CONFIG_NETFILTER_XT_TARGET_MARK=m +CONFIG_NETFILTER_XT_TARGET_NFLOG=m +CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m +CONFIG_NETFILTER_XT_TARGET_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_TRACE=m +CONFIG_NETFILTER_XT_TARGET_SECMARK=m +CONFIG_NETFILTER_XT_TARGET_TCPMSS=m +CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m +CONFIG_NETFILTER_XT_MATCH_COMMENT=m +CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m +CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m +CONFIG_NETFILTER_XT_MATCH_CONNMARK=m +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m +CONFIG_NETFILTER_XT_MATCH_DCCP=m +CONFIG_NETFILTER_XT_MATCH_ESP=m +CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m +CONFIG_NETFILTER_XT_MATCH_HELPER=m +CONFIG_NETFILTER_XT_MATCH_IPRANGE=m +CONFIG_NETFILTER_XT_MATCH_LENGTH=m +CONFIG_NETFILTER_XT_MATCH_LIMIT=m +CONFIG_NETFILTER_XT_MATCH_MAC=m +CONFIG_NETFILTER_XT_MATCH_MARK=m +CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m +CONFIG_NETFILTER_XT_MATCH_OWNER=m +CONFIG_NETFILTER_XT_MATCH_POLICY=m +CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m +CONFIG_NETFILTER_XT_MATCH_QUOTA=m +CONFIG_NETFILTER_XT_MATCH_RATEEST=m +CONFIG_NETFILTER_XT_MATCH_REALM=m +CONFIG_NETFILTER_XT_MATCH_RECENT=m +CONFIG_NETFILTER_XT_MATCH_SOCKET=m +CONFIG_NETFILTER_XT_MATCH_STATE=m +CONFIG_NETFILTER_XT_MATCH_STATISTIC=m +CONFIG_NETFILTER_XT_MATCH_STRING=m +CONFIG_NETFILTER_XT_MATCH_TCPMSS=m +CONFIG_NETFILTER_XT_MATCH_TIME=m +CONFIG_NETFILTER_XT_MATCH_U32=m +CONFIG_IP_VS=m +CONFIG_IP_VS_IPV6=y +CONFIG_IP_VS_PROTO_TCP=y +CONFIG_IP_VS_PROTO_UDP=y +CONFIG_IP_VS_PROTO_ESP=y +CONFIG_IP_VS_PROTO_AH=y +CONFIG_IP_VS_RR=m +CONFIG_IP_VS_WRR=m +CONFIG_IP_VS_LC=m +CONFIG_IP_VS_WLC=m +CONFIG_IP_VS_LBLC=m +CONFIG_IP_VS_LBLCR=m +CONFIG_IP_VS_DH=m +CONFIG_IP_VS_SH=m +CONFIG_IP_VS_SED=m +CONFIG_IP_VS_NQ=m +CONFIG_NF_CONNTRACK_IPV4=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_AH=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_TTL=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m +CONFIG_NF_CONNTRACK_IPV6=m +CONFIG_IP6_NF_MATCH_AH=m +CONFIG_IP6_NF_MATCH_EUI64=m +CONFIG_IP6_NF_MATCH_FRAG=m +CONFIG_IP6_NF_MATCH_OPTS=m +CONFIG_IP6_NF_MATCH_HL=m +CONFIG_IP6_NF_MATCH_IPV6HEADER=m +CONFIG_IP6_NF_MATCH_MH=m +CONFIG_IP6_NF_MATCH_RT=m +CONFIG_IP6_NF_TARGET_HL=m +CONFIG_IP6_NF_FILTER=m +CONFIG_IP6_NF_TARGET_REJECT=m +CONFIG_IP6_NF_MANGLE=m +CONFIG_IP6_NF_RAW=m +CONFIG_BRIDGE_NF_EBTABLES=m +CONFIG_BRIDGE_EBT_BROUTE=m +CONFIG_BRIDGE_EBT_T_FILTER=m +CONFIG_BRIDGE_EBT_T_NAT=m +CONFIG_BRIDGE_EBT_802_3=m +CONFIG_BRIDGE_EBT_AMONG=m +CONFIG_BRIDGE_EBT_ARP=m +CONFIG_BRIDGE_EBT_IP=m +CONFIG_BRIDGE_EBT_IP6=m +CONFIG_BRIDGE_EBT_LIMIT=m +CONFIG_BRIDGE_EBT_MARK=m +CONFIG_BRIDGE_EBT_PKTTYPE=m +CONFIG_BRIDGE_EBT_STP=m +CONFIG_BRIDGE_EBT_VLAN=m +CONFIG_BRIDGE_EBT_ARPREPLY=m +CONFIG_BRIDGE_EBT_DNAT=m +CONFIG_BRIDGE_EBT_MARK_T=m +CONFIG_BRIDGE_EBT_REDIRECT=m +CONFIG_BRIDGE_EBT_SNAT=m +CONFIG_BRIDGE_EBT_LOG=m +CONFIG_BRIDGE_EBT_ULOG=m +CONFIG_BRIDGE_EBT_NFLOG=m +CONFIG_IP_SCTP=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_VLAN_8021Q_GVRP=y +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_PHONET=m +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_FLOW=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_ACT_GACT=m +CONFIG_GACT_PROB=y +CONFIG_NET_ACT_MIRRED=m +CONFIG_NET_ACT_IPT=m +CONFIG_NET_ACT_NAT=m +CONFIG_NET_ACT_PEDIT=m +CONFIG_NET_ACT_SIMP=m +CONFIG_NET_ACT_SKBEDIT=m +CONFIG_NET_CLS_IND=y +CONFIG_CFG80211=m +CONFIG_MAC80211=m +CONFIG_MAC80211_RC_PID=y +CONFIG_MAC80211_RC_DEFAULT_PID=y +CONFIG_MAC80211_MESH=y +CONFIG_RFKILL=m +CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" +CONFIG_CONNECTOR=m +CONFIG_MTD=y +CONFIG_MTD_CHAR=y +CONFIG_MTD_BLOCK=y +CONFIG_MTD_OOPS=m +CONFIG_MTD_CFI=y +CONFIG_MTD_CFI_INTELEXT=y +CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_CFI_STAA=y +CONFIG_MTD_PHYSMAP=y +CONFIG_MTD_UBI=m +CONFIG_MTD_UBI_GLUEBI=m +CONFIG_BLK_DEV_FD=m +CONFIG_BLK_DEV_UMEM=m +CONFIG_BLK_DEV_LOOP=m +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_BLK_DEV_NBD=m +CONFIG_BLK_DEV_RAM=y +CONFIG_CDROM_PKTCDVD=m +CONFIG_ATA_OVER_ETH=m +CONFIG_VIRTIO_BLK=y +CONFIG_IDE=y +CONFIG_BLK_DEV_IDECD=y +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_BLK_DEV_IT8213=m +CONFIG_BLK_DEV_TC86C001=m +CONFIG_RAID_ATTRS=m +CONFIG_SCSI=m +CONFIG_SCSI_TGT=m +CONFIG_BLK_DEV_SD=m +CONFIG_CHR_DEV_ST=m +CONFIG_CHR_DEV_OSST=m +CONFIG_BLK_DEV_SR=m +CONFIG_BLK_DEV_SR_VENDOR=y +CONFIG_CHR_DEV_SG=m +CONFIG_SCSI_MULTI_LUN=y +CONFIG_SCSI_CONSTANTS=y +CONFIG_SCSI_LOGGING=y +CONFIG_SCSI_SCAN_ASYNC=y +CONFIG_SCSI_FC_ATTRS=m +CONFIG_ISCSI_TCP=m +CONFIG_BLK_DEV_3W_XXXX_RAID=m +CONFIG_SCSI_3W_9XXX=m +CONFIG_SCSI_ACARD=m +CONFIG_SCSI_AACRAID=m +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +# CONFIG_AIC7XXX_DEBUG_ENABLE is not set +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +CONFIG_MD_RAID10=m +CONFIG_MD_RAID456=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +CONFIG_BLK_DEV_DM=m +CONFIG_DM_CRYPT=m +CONFIG_DM_SNAPSHOT=m +CONFIG_DM_MIRROR=m +CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_NETDEVICES=y +CONFIG_BONDING=m +CONFIG_DUMMY=m +CONFIG_EQUALIZER=m +CONFIG_IFB=m +CONFIG_MACVLAN=m +CONFIG_TUN=m +CONFIG_VETH=m +CONFIG_VIRTIO_NET=y +CONFIG_PCNET32=y +CONFIG_CHELSIO_T3=m +CONFIG_AX88796=m +CONFIG_NETXEN_NIC=m +CONFIG_TC35815=m +CONFIG_MARVELL_PHY=m +CONFIG_DAVICOM_PHY=m +CONFIG_QSEMI_PHY=m +CONFIG_LXT_PHY=m +CONFIG_CICADA_PHY=m +CONFIG_VITESSE_PHY=m +CONFIG_SMSC_PHY=m +CONFIG_BROADCOM_PHY=m +CONFIG_ICPLUS_PHY=m +CONFIG_REALTEK_PHY=m +CONFIG_ATMEL=m +CONFIG_PCI_ATMEL=m +CONFIG_PRISM54=m +CONFIG_HOSTAP=m +CONFIG_HOSTAP_FIRMWARE=y +CONFIG_HOSTAP_FIRMWARE_NVRAM=y +CONFIG_HOSTAP_PLX=m +CONFIG_HOSTAP_PCI=m +CONFIG_IPW2100=m +CONFIG_IPW2100_MONITOR=y +CONFIG_LIBERTAS=m +# CONFIG_INPUT_KEYBOARD is not set +# CONFIG_INPUT_MOUSE is not set +# CONFIG_SERIO_I8042 is not set +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +# CONFIG_HWMON is not set +CONFIG_FB=y +CONFIG_FB_CIRRUS=y +# CONFIG_VGA_CONSOLE is not set +CONFIG_FRAMEBUFFER_CONSOLE=y +CONFIG_HID=m +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_UIO=m +CONFIG_UIO_CIF=m +CONFIG_VIRTIO_PCI=y +CONFIG_VIRTIO_BALLOON=y +CONFIG_VIRTIO_MMIO=y +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +CONFIG_REISERFS_FS=m +CONFIG_REISERFS_PROC_INFO=y +CONFIG_REISERFS_FS_XATTR=y +CONFIG_REISERFS_FS_POSIX_ACL=y +CONFIG_REISERFS_FS_SECURITY=y +CONFIG_JFS_FS=m +CONFIG_JFS_POSIX_ACL=y +CONFIG_JFS_SECURITY=y +CONFIG_XFS_FS=m +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_FUSE_FS=m +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_UDF_FS=m +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_AFFS_FS=m +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +CONFIG_BEFS_FS=m +CONFIG_BFS_FS=m +CONFIG_EFS_FS=m +CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_COMPRESSION_OPTIONS=y +CONFIG_JFFS2_RUBIN=y +CONFIG_CRAMFS=m +CONFIG_VXFS_FS=m +CONFIG_MINIX_FS=m +CONFIG_ROMFS_FS=m +CONFIG_SYSV_FS=m +CONFIG_UFS_FS=m +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_NFSD=y +CONFIG_NFSD_V3=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_CODEPAGE_737=m +CONFIG_NLS_CODEPAGE_775=m +CONFIG_NLS_CODEPAGE_850=m +CONFIG_NLS_CODEPAGE_852=m +CONFIG_NLS_CODEPAGE_855=m +CONFIG_NLS_CODEPAGE_857=m +CONFIG_NLS_CODEPAGE_860=m +CONFIG_NLS_CODEPAGE_861=m +CONFIG_NLS_CODEPAGE_862=m +CONFIG_NLS_CODEPAGE_863=m +CONFIG_NLS_CODEPAGE_864=m +CONFIG_NLS_CODEPAGE_865=m +CONFIG_NLS_CODEPAGE_866=m +CONFIG_NLS_CODEPAGE_869=m +CONFIG_NLS_CODEPAGE_936=m +CONFIG_NLS_CODEPAGE_950=m +CONFIG_NLS_CODEPAGE_932=m +CONFIG_NLS_CODEPAGE_949=m +CONFIG_NLS_CODEPAGE_874=m +CONFIG_NLS_ISO8859_8=m +CONFIG_NLS_CODEPAGE_1250=m +CONFIG_NLS_CODEPAGE_1251=m +CONFIG_NLS_ASCII=m +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +CONFIG_NLS_KOI8_R=m +CONFIG_NLS_KOI8_U=m +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_CRYPTD=m +CONFIG_CRYPTO_LRW=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_XCBC=m +CONFIG_CRYPTO_MD4=m +CONFIG_CRYPTO_SHA256=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAMELLIA=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_FCRYPT=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +CONFIG_CRC16=m diff --git a/arch/mips/configs/maltaaprp_defconfig b/arch/mips/configs/maltaaprp_defconfig new file mode 100644 index 000000000000..93057a760dfa --- /dev/null +++ b/arch/mips/configs/maltaaprp_defconfig @@ -0,0 +1,195 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_MIPS_VPE_LOADER=y +CONFIG_MIPS_VPE_APSP_API=y +CONFIG_HZ_100=y +CONFIG_LOCALVERSION="aprp" +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_AUDIT=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_SYSCTL_SYSCALL=y +CONFIG_EMBEDDED=y +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PCI=y +# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +# CONFIG_INET_LRO is not set +CONFIG_IPV6_PRIVACY=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_CLS_IND=y +# CONFIG_WIRELESS is not set +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_IDE=y +# CONFIG_IDE_PROC_FS is not set +# CONFIG_IDEPCI_PCIBUS_ORDER is not set +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_SCSI=y +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_SG=y +# CONFIG_SCSI_LOWLEVEL is not set +CONFIG_NETDEVICES=y +# CONFIG_NET_VENDOR_3COM is not set +# CONFIG_NET_VENDOR_ADAPTEC is not set +# CONFIG_NET_VENDOR_ALTEON is not set +CONFIG_PCNET32=y +# CONFIG_NET_VENDOR_ATHEROS is not set +# CONFIG_NET_VENDOR_BROADCOM is not set +# CONFIG_NET_VENDOR_BROCADE is not set +# CONFIG_NET_VENDOR_CHELSIO is not set +# CONFIG_NET_VENDOR_CISCO is not set +# CONFIG_NET_VENDOR_DEC is not set +# CONFIG_NET_VENDOR_DLINK is not set +# CONFIG_NET_VENDOR_EMULEX is not set +# CONFIG_NET_VENDOR_EXAR is not set +# CONFIG_NET_VENDOR_HP is not set +# CONFIG_NET_VENDOR_INTEL is not set +# CONFIG_NET_VENDOR_MARVELL is not set +# CONFIG_NET_VENDOR_MELLANOX is not set +# CONFIG_NET_VENDOR_MICREL is not set +# CONFIG_NET_VENDOR_MYRI is not set +# CONFIG_NET_VENDOR_NATSEMI is not set +# CONFIG_NET_VENDOR_NVIDIA is not set +# CONFIG_NET_VENDOR_OKI is not set +# CONFIG_NET_PACKET_ENGINE is not set +# CONFIG_NET_VENDOR_QLOGIC is not set +# CONFIG_NET_VENDOR_REALTEK is not set +# CONFIG_NET_VENDOR_RDC is not set +# CONFIG_NET_VENDOR_SEEQ is not set +# CONFIG_NET_VENDOR_SILAN is not set +# CONFIG_NET_VENDOR_SIS is not set +# CONFIG_NET_VENDOR_SMSC is not set +# CONFIG_NET_VENDOR_STMICRO is not set +# CONFIG_NET_VENDOR_SUN is not set +# CONFIG_NET_VENDOR_TEHUTI is not set +# CONFIG_NET_VENDOR_TI is not set +# CONFIG_NET_VENDOR_TOSHIBA is not set +# CONFIG_NET_VENDOR_VIA is not set +# CONFIG_WLAN is not set +# CONFIG_VT is not set +CONFIG_LEGACY_PTY_COUNT=16 +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_HW_RANDOM=y +# CONFIG_HWMON is not set +CONFIG_VIDEO_OUTPUT_CONTROL=m +CONFIG_FB=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_MATROX=y +CONFIG_FB_MATROX_G=y +CONFIG_USB=y +CONFIG_USB_EHCI_HCD=y +# CONFIG_USB_EHCI_TT_NEWSCHED is not set +CONFIG_USB_UHCI_HCD=y +CONFIG_USB_STORAGE=y +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=y +CONFIG_LEDS_TRIGGER_IDE_DISK=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_LEDS_TRIGGER_BACKLIGHT=y +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set +CONFIG_XFS_FS=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_CIFS=m +CONFIG_CIFS_WEAK_PW_HASH=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_ISO8859_1=m +# CONFIG_FTRACE is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/configs/maltasmtc_defconfig b/arch/mips/configs/maltasmtc_defconfig new file mode 100644 index 000000000000..4e54b75d89be --- /dev/null +++ b/arch/mips/configs/maltasmtc_defconfig @@ -0,0 +1,196 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_MIPS_MT_SMTC=y +# CONFIG_MIPS_MT_FPAFF is not set +CONFIG_NR_CPUS=9 +CONFIG_HZ_48=y +CONFIG_LOCALVERSION="smtc" +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_AUDIT=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_SYSCTL_SYSCALL=y +CONFIG_EMBEDDED=y +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PCI=y +# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +# CONFIG_INET_LRO is not set +CONFIG_IPV6_PRIVACY=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_CLS_IND=y +# CONFIG_WIRELESS is not set +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_IDE=y +# CONFIG_IDE_PROC_FS is not set +# CONFIG_IDEPCI_PCIBUS_ORDER is not set +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_SCSI=y +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_SG=y +# CONFIG_SCSI_LOWLEVEL is not set +CONFIG_NETDEVICES=y +# CONFIG_NET_VENDOR_3COM is not set +# CONFIG_NET_VENDOR_ADAPTEC is not set +# CONFIG_NET_VENDOR_ALTEON is not set +CONFIG_PCNET32=y +# CONFIG_NET_VENDOR_ATHEROS is not set +# CONFIG_NET_VENDOR_BROADCOM is not set +# CONFIG_NET_VENDOR_BROCADE is not set +# CONFIG_NET_VENDOR_CHELSIO is not set +# CONFIG_NET_VENDOR_CISCO is not set +# CONFIG_NET_VENDOR_DEC is not set +# CONFIG_NET_VENDOR_DLINK is not set +# CONFIG_NET_VENDOR_EMULEX is not set +# CONFIG_NET_VENDOR_EXAR is not set +# CONFIG_NET_VENDOR_HP is not set +# CONFIG_NET_VENDOR_INTEL is not set +# CONFIG_NET_VENDOR_MARVELL is not set +# CONFIG_NET_VENDOR_MELLANOX is not set +# CONFIG_NET_VENDOR_MICREL is not set +# CONFIG_NET_VENDOR_MYRI is not set +# CONFIG_NET_VENDOR_NATSEMI is not set +# CONFIG_NET_VENDOR_NVIDIA is not set +# CONFIG_NET_VENDOR_OKI is not set +# CONFIG_NET_PACKET_ENGINE is not set +# CONFIG_NET_VENDOR_QLOGIC is not set +# CONFIG_NET_VENDOR_REALTEK is not set +# CONFIG_NET_VENDOR_RDC is not set +# CONFIG_NET_VENDOR_SEEQ is not set +# CONFIG_NET_VENDOR_SILAN is not set +# CONFIG_NET_VENDOR_SIS is not set +# CONFIG_NET_VENDOR_SMSC is not set +# CONFIG_NET_VENDOR_STMICRO is not set +# CONFIG_NET_VENDOR_SUN is not set +# CONFIG_NET_VENDOR_TEHUTI is not set +# CONFIG_NET_VENDOR_TI is not set +# CONFIG_NET_VENDOR_TOSHIBA is not set +# CONFIG_NET_VENDOR_VIA is not set +# CONFIG_WLAN is not set +# CONFIG_VT is not set +CONFIG_LEGACY_PTY_COUNT=16 +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_HW_RANDOM=y +# CONFIG_HWMON is not set +CONFIG_VIDEO_OUTPUT_CONTROL=m +CONFIG_FB=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_MATROX=y +CONFIG_FB_MATROX_G=y +CONFIG_USB=y +CONFIG_USB_EHCI_HCD=y +# CONFIG_USB_EHCI_TT_NEWSCHED is not set +CONFIG_USB_UHCI_HCD=y +CONFIG_USB_STORAGE=y +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=y +CONFIG_LEDS_TRIGGER_IDE_DISK=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_LEDS_TRIGGER_BACKLIGHT=y +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set +CONFIG_XFS_FS=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_CIFS=m +CONFIG_CIFS_WEAK_PW_HASH=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_ISO8859_1=m +# CONFIG_FTRACE is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/configs/maltasmvp_defconfig b/arch/mips/configs/maltasmvp_defconfig new file mode 100644 index 000000000000..8a666021b870 --- /dev/null +++ b/arch/mips/configs/maltasmvp_defconfig @@ -0,0 +1,199 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_MIPS_MT_SMP=y +CONFIG_SCHED_SMT=y +CONFIG_MIPS_CMP=y +CONFIG_NR_CPUS=8 +CONFIG_HZ_100=y +CONFIG_LOCALVERSION="cmp" +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_AUDIT=y +CONFIG_NO_HZ=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_SYSCTL_SYSCALL=y +CONFIG_EMBEDDED=y +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PCI=y +# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +# CONFIG_INET_LRO is not set +CONFIG_IPV6_PRIVACY=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_CLS_IND=y +# CONFIG_WIRELESS is not set +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_IDE=y +# CONFIG_IDE_PROC_FS is not set +# CONFIG_IDEPCI_PCIBUS_ORDER is not set +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_SCSI=y +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_SG=y +# CONFIG_SCSI_LOWLEVEL is not set +CONFIG_NETDEVICES=y +# CONFIG_NET_VENDOR_3COM is not set +# CONFIG_NET_VENDOR_ADAPTEC is not set +# CONFIG_NET_VENDOR_ALTEON is not set +CONFIG_PCNET32=y +# CONFIG_NET_VENDOR_ATHEROS is not set +# CONFIG_NET_VENDOR_BROADCOM is not set +# CONFIG_NET_VENDOR_BROCADE is not set +# CONFIG_NET_VENDOR_CHELSIO is not set +# CONFIG_NET_VENDOR_CISCO is not set +# CONFIG_NET_VENDOR_DEC is not set +# CONFIG_NET_VENDOR_DLINK is not set +# CONFIG_NET_VENDOR_EMULEX is not set +# CONFIG_NET_VENDOR_EXAR is not set +# CONFIG_NET_VENDOR_HP is not set +# CONFIG_NET_VENDOR_INTEL is not set +# CONFIG_NET_VENDOR_MARVELL is not set +# CONFIG_NET_VENDOR_MELLANOX is not set +# CONFIG_NET_VENDOR_MICREL is not set +# CONFIG_NET_VENDOR_MYRI is not set +# CONFIG_NET_VENDOR_NATSEMI is not set +# CONFIG_NET_VENDOR_NVIDIA is not set +# CONFIG_NET_VENDOR_OKI is not set +# CONFIG_NET_PACKET_ENGINE is not set +# CONFIG_NET_VENDOR_QLOGIC is not set +# CONFIG_NET_VENDOR_REALTEK is not set +# CONFIG_NET_VENDOR_RDC is not set +# CONFIG_NET_VENDOR_SEEQ is not set +# CONFIG_NET_VENDOR_SILAN is not set +# CONFIG_NET_VENDOR_SIS is not set +# CONFIG_NET_VENDOR_SMSC is not set +# CONFIG_NET_VENDOR_STMICRO is not set +# CONFIG_NET_VENDOR_SUN is not set +# CONFIG_NET_VENDOR_TEHUTI is not set +# CONFIG_NET_VENDOR_TI is not set +# CONFIG_NET_VENDOR_TOSHIBA is not set +# CONFIG_NET_VENDOR_VIA is not set +# CONFIG_NET_VENDOR_WIZNET is not set +# CONFIG_WLAN is not set +# CONFIG_VT is not set +CONFIG_LEGACY_PTY_COUNT=4 +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_HW_RANDOM=y +# CONFIG_HWMON is not set +CONFIG_VIDEO_OUTPUT_CONTROL=m +CONFIG_FB=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_MATROX=y +CONFIG_FB_MATROX_G=y +CONFIG_USB=y +CONFIG_USB_EHCI_HCD=y +# CONFIG_USB_EHCI_TT_NEWSCHED is not set +CONFIG_USB_UHCI_HCD=y +CONFIG_USB_STORAGE=y +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=y +CONFIG_LEDS_TRIGGER_IDE_DISK=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_LEDS_TRIGGER_BACKLIGHT=y +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set +CONFIG_XFS_FS=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_CIFS=m +CONFIG_CIFS_WEAK_PW_HASH=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_ISO8859_1=m +# CONFIG_FTRACE is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/configs/maltaup_defconfig b/arch/mips/configs/maltaup_defconfig new file mode 100644 index 000000000000..9868fc9c1133 --- /dev/null +++ b/arch/mips/configs/maltaup_defconfig @@ -0,0 +1,194 @@ +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_HZ_100=y +CONFIG_LOCALVERSION="up" +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_AUDIT=y +CONFIG_NO_HZ=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_SYSCTL_SYSCALL=y +CONFIG_EMBEDDED=y +CONFIG_SLAB=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODVERSIONS=y +CONFIG_MODULE_SRCVERSION_ALL=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PCI=y +# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_NET_IPIP=m +CONFIG_IP_MROUTE=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +# CONFIG_INET_LRO is not set +CONFIG_IPV6_PRIVACY=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_TUNNEL=m +CONFIG_BRIDGE=m +CONFIG_VLAN_8021Q=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +CONFIG_IPDDP_DECAP=y +CONFIG_NET_SCHED=y +CONFIG_NET_SCH_CBQ=m +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_DSMARK=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_TCINDEX=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_NET_CLS_RSVP=m +CONFIG_NET_CLS_RSVP6=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=y +CONFIG_NET_CLS_IND=y +# CONFIG_WIRELESS is not set +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_IDE=y +# CONFIG_IDE_PROC_FS is not set +# CONFIG_IDEPCI_PCIBUS_ORDER is not set +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_PIIX=y +CONFIG_SCSI=y +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_SG=y +# CONFIG_SCSI_LOWLEVEL is not set +CONFIG_NETDEVICES=y +# CONFIG_NET_VENDOR_3COM is not set +# CONFIG_NET_VENDOR_ADAPTEC is not set +# CONFIG_NET_VENDOR_ALTEON is not set +CONFIG_PCNET32=y +# CONFIG_NET_VENDOR_ATHEROS is not set +# CONFIG_NET_VENDOR_BROADCOM is not set +# CONFIG_NET_VENDOR_BROCADE is not set +# CONFIG_NET_VENDOR_CHELSIO is not set +# CONFIG_NET_VENDOR_CISCO is not set +# CONFIG_NET_VENDOR_DEC is not set +# CONFIG_NET_VENDOR_DLINK is not set +# CONFIG_NET_VENDOR_EMULEX is not set +# CONFIG_NET_VENDOR_EXAR is not set +# CONFIG_NET_VENDOR_HP is not set +# CONFIG_NET_VENDOR_INTEL is not set +# CONFIG_NET_VENDOR_MARVELL is not set +# CONFIG_NET_VENDOR_MELLANOX is not set +# CONFIG_NET_VENDOR_MICREL is not set +# CONFIG_NET_VENDOR_MYRI is not set +# CONFIG_NET_VENDOR_NATSEMI is not set +# CONFIG_NET_VENDOR_NVIDIA is not set +# CONFIG_NET_VENDOR_OKI is not set +# CONFIG_NET_PACKET_ENGINE is not set +# CONFIG_NET_VENDOR_QLOGIC is not set +# CONFIG_NET_VENDOR_REALTEK is not set +# CONFIG_NET_VENDOR_RDC is not set +# CONFIG_NET_VENDOR_SEEQ is not set +# CONFIG_NET_VENDOR_SILAN is not set +# CONFIG_NET_VENDOR_SIS is not set +# CONFIG_NET_VENDOR_SMSC is not set +# CONFIG_NET_VENDOR_STMICRO is not set +# CONFIG_NET_VENDOR_SUN is not set +# CONFIG_NET_VENDOR_TEHUTI is not set +# CONFIG_NET_VENDOR_TI is not set +# CONFIG_NET_VENDOR_TOSHIBA is not set +# CONFIG_NET_VENDOR_VIA is not set +# CONFIG_WLAN is not set +# CONFIG_VT is not set +CONFIG_LEGACY_PTY_COUNT=16 +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_HW_RANDOM=y +# CONFIG_HWMON is not set +CONFIG_VIDEO_OUTPUT_CONTROL=m +CONFIG_FB=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_MATROX=y +CONFIG_FB_MATROX_G=y +CONFIG_USB=y +CONFIG_USB_EHCI_HCD=y +# CONFIG_USB_EHCI_TT_NEWSCHED is not set +CONFIG_USB_UHCI_HCD=y +CONFIG_USB_STORAGE=y +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=y +CONFIG_LEDS_TRIGGER_IDE_DISK=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_LEDS_TRIGGER_BACKLIGHT=y +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_CMOS=y +CONFIG_EXT2_FS=y +CONFIG_EXT3_FS=y +# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set +CONFIG_XFS_FS=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +CONFIG_QFMT_V2=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_CIFS=m +CONFIG_CIFS_WEAK_PW_HASH=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_ISO8859_1=m +# CONFIG_FTRACE is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA512=m +CONFIG_CRYPTO_TGR192=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_ANUBIS=m +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/configs/sead3_defconfig b/arch/mips/configs/sead3_defconfig index e3eec68d9132..0abe681c11a0 100644 --- a/arch/mips/configs/sead3_defconfig +++ b/arch/mips/configs/sead3_defconfig @@ -2,7 +2,6 @@ CONFIG_MIPS_SEAD3=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y CONFIG_HZ_100=y -CONFIG_EXPERIMENTAL=y CONFIG_SYSVIPC=y CONFIG_POSIX_MQUEUE=y CONFIG_NO_HZ=y @@ -115,10 +114,8 @@ CONFIG_NLS_ISO8859_1=y CONFIG_NLS_ISO8859_15=y CONFIG_NLS_UTF8=y # CONFIG_FTRACE is not set -CONFIG_CRYPTO=y CONFIG_CRYPTO_CBC=y CONFIG_CRYPTO_ECB=y -CONFIG_CRYPTO_AES=y CONFIG_CRYPTO_ARC4=y # CONFIG_CRYPTO_ANSI_CPRNG is not set # CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/configs/sead3micro_defconfig b/arch/mips/configs/sead3micro_defconfig new file mode 100644 index 000000000000..2a0da5bf4b64 --- /dev/null +++ b/arch/mips/configs/sead3micro_defconfig @@ -0,0 +1,122 @@ +CONFIG_MIPS_SEAD3=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CPU_MIPS32_R2=y +CONFIG_CPU_MICROMIPS=y +CONFIG_HZ_100=y +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_LOG_BUF_SHIFT=15 +CONFIG_EMBEDDED=y +CONFIG_SLAB=y +CONFIG_PROFILING=y +CONFIG_OPROFILE=y +CONFIG_MODULES=y +# CONFIG_BLK_DEV_BSG is not set +# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_INET=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +# CONFIG_INET_XFRM_MODE_TRANSPORT is not set +# CONFIG_INET_XFRM_MODE_TUNNEL is not set +# CONFIG_INET_XFRM_MODE_BEET is not set +# CONFIG_INET_LRO is not set +# CONFIG_INET_DIAG is not set +# CONFIG_IPV6 is not set +# CONFIG_WIRELESS is not set +CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" +CONFIG_MTD=y +CONFIG_MTD_CHAR=y +CONFIG_MTD_BLOCK=y +CONFIG_MTD_CFI=y +CONFIG_MTD_CFI_INTELEXT=y +CONFIG_MTD_PHYSMAP=y +CONFIG_MTD_UBI=y +CONFIG_MTD_UBI_GLUEBI=y +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_CRYPTOLOOP=m +CONFIG_SCSI=y +# CONFIG_SCSI_PROC_FS is not set +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_SG=y +# CONFIG_SCSI_LOWLEVEL is not set +CONFIG_NETDEVICES=y +CONFIG_SMSC911X=y +# CONFIG_NET_VENDOR_WIZNET is not set +CONFIG_MARVELL_PHY=y +CONFIG_DAVICOM_PHY=y +CONFIG_QSEMI_PHY=y +CONFIG_LXT_PHY=y +CONFIG_CICADA_PHY=y +CONFIG_VITESSE_PHY=y +CONFIG_SMSC_PHY=y +CONFIG_BROADCOM_PHY=y +CONFIG_ICPLUS_PHY=y +# CONFIG_WLAN is not set +# CONFIG_INPUT_MOUSEDEV is not set +# CONFIG_INPUT_KEYBOARD is not set +# CONFIG_INPUT_MOUSE is not set +# CONFIG_SERIO is not set +# CONFIG_CONSOLE_TRANSLATIONS is not set +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_LEGACY_PTY_COUNT=32 +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_NR_UARTS=2 +CONFIG_SERIAL_8250_RUNTIME_UARTS=2 +# CONFIG_HW_RANDOM is not set +CONFIG_I2C=y +# CONFIG_I2C_COMPAT is not set +CONFIG_I2C_CHARDEV=y +# CONFIG_I2C_HELPER_AUTO is not set +CONFIG_SPI=y +CONFIG_SENSORS_ADT7475=y +CONFIG_BACKLIGHT_LCD_SUPPORT=y +CONFIG_LCD_CLASS_DEVICE=y +CONFIG_BACKLIGHT_CLASS_DEVICE=y +# CONFIG_VGA_CONSOLE is not set +CONFIG_USB=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y +CONFIG_USB_EHCI_HCD=y +CONFIG_USB_EHCI_ROOT_HUB_TT=y +CONFIG_USB_STORAGE=y +CONFIG_MMC=y +CONFIG_MMC_DEBUG=y +CONFIG_MMC_SPI=y +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_DRV_M41T80=y +CONFIG_EXT3_FS=y +# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set +CONFIG_XFS_FS=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_QUOTA=y +# CONFIG_PRINT_QUOTA_WARNING is not set +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_TMPFS=y +CONFIG_JFFS2_FS=y +CONFIG_NFS_FS=y +CONFIG_ROOT_NFS=y +CONFIG_NLS_CODEPAGE_437=y +CONFIG_NLS_ASCII=y +CONFIG_NLS_ISO8859_1=y +CONFIG_NLS_ISO8859_15=y +CONFIG_NLS_UTF8=y +# CONFIG_FTRACE is not set +CONFIG_CRYPTO_CBC=y +CONFIG_CRYPTO_ECB=y +CONFIG_CRYPTO_ARC4=y +# CONFIG_CRYPTO_ANSI_CPRNG is not set +# CONFIG_CRYPTO_HW is not set diff --git a/arch/mips/fw/lib/Makefile b/arch/mips/fw/lib/Makefile index 84befc968fc4..529150516777 100644 --- a/arch/mips/fw/lib/Makefile +++ b/arch/mips/fw/lib/Makefile @@ -2,4 +2,6 @@ # Makefile for generic prom monitor library routines under Linux. # +lib-y += cmdline.o + lib-$(CONFIG_64BIT) += call_o32.o diff --git a/arch/mips/fw/lib/cmdline.c b/arch/mips/fw/lib/cmdline.c new file mode 100644 index 000000000000..ffd0345780ae --- /dev/null +++ b/arch/mips/fw/lib/cmdline.c @@ -0,0 +1,101 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. + */ +#include +#include +#include + +#include +#include + +int fw_argc; +int *_fw_argv; +int *_fw_envp; + +void __init fw_init_cmdline(void) +{ + int i; + + /* Validate command line parameters. */ + if ((fw_arg0 >= CKSEG0) || (fw_arg1 < CKSEG0)) { + fw_argc = 0; + _fw_argv = NULL; + } else { + fw_argc = (fw_arg0 & 0x0000ffff); + _fw_argv = (int *)fw_arg1; + } + + /* Validate environment pointer. */ + if (fw_arg2 < CKSEG0) + _fw_envp = NULL; + else + _fw_envp = (int *)fw_arg2; + + for (i = 1; i < fw_argc; i++) { + strlcat(arcs_cmdline, fw_argv(i), COMMAND_LINE_SIZE); + if (i < (fw_argc - 1)) + strlcat(arcs_cmdline, " ", COMMAND_LINE_SIZE); + } +} + +char * __init fw_getcmdline(void) +{ + return &(arcs_cmdline[0]); +} + +char *fw_getenv(char *envname) +{ + char *result = NULL; + + if (_fw_envp != NULL) { + /* + * Return a pointer to the given environment variable. + * YAMON uses "name", "value" pairs, while U-Boot uses + * "name=value". + */ + int i, yamon, index = 0; + + yamon = (strchr(fw_envp(index), '=') == NULL); + i = strlen(envname); + + while (fw_envp(index)) { + if (strncmp(envname, fw_envp(index), i) == 0) { + if (yamon) { + result = fw_envp(index + 1); + break; + } else if (fw_envp(index)[i] == '=') { + result = (fw_envp(index + 1) + i); + break; + } + } + + /* Increment array index. */ + if (yamon) + index += 2; + else + index += 1; + } + } + + return result; +} + +unsigned long fw_getenvl(char *envname) +{ + unsigned long envl = 0UL; + char *str; + long val; + int tmp; + + str = fw_getenv(envname); + if (str) { + tmp = kstrtol(str, 0, &val); + envl = (unsigned long)val; + } + + return envl; +} diff --git a/arch/mips/include/asm/asm.h b/arch/mips/include/asm/asm.h index 164a21e65b42..879691d194af 100644 --- a/arch/mips/include/asm/asm.h +++ b/arch/mips/include/asm/asm.h @@ -296,6 +296,7 @@ symbol = value #define LONG_SUBU subu #define LONG_L lw #define LONG_S sw +#define LONG_SP swp #define LONG_SLL sll #define LONG_SLLV sllv #define LONG_SRL srl @@ -318,6 +319,7 @@ symbol = value #define LONG_SUBU dsubu #define LONG_L ld #define LONG_S sd +#define LONG_SP sdp #define LONG_SLL dsll #define LONG_SLLV dsllv #define LONG_SRL dsrl diff --git a/arch/mips/include/asm/bootinfo.h b/arch/mips/include/asm/bootinfo.h index b71dd5b16085..4d2cdea5aa37 100644 --- a/arch/mips/include/asm/bootinfo.h +++ b/arch/mips/include/asm/bootinfo.h @@ -104,6 +104,7 @@ struct boot_mem_map { extern struct boot_mem_map boot_mem_map; extern void add_memory_region(phys_t start, phys_t size, long type); +extern void detect_memory_region(phys_t start, phys_t sz_min, phys_t sz_max); extern void prom_init(void); extern void prom_free_prom_memory(void); diff --git a/arch/mips/include/asm/branch.h b/arch/mips/include/asm/branch.h index 888766ae1f85..e28a3e0eb3cb 100644 --- a/arch/mips/include/asm/branch.h +++ b/arch/mips/include/asm/branch.h @@ -11,6 +11,14 @@ #include #include +extern int __isa_exception_epc(struct pt_regs *regs); +extern int __compute_return_epc(struct pt_regs *regs); +extern int __compute_return_epc_for_insn(struct pt_regs *regs, + union mips_instruction insn); +extern int __microMIPS_compute_return_epc(struct pt_regs *regs); +extern int __MIPS16e_compute_return_epc(struct pt_regs *regs); + + static inline int delay_slot(struct pt_regs *regs) { return regs->cp0_cause & CAUSEF_BD; @@ -18,20 +26,27 @@ static inline int delay_slot(struct pt_regs *regs) static inline unsigned long exception_epc(struct pt_regs *regs) { - if (!delay_slot(regs)) + if (likely(!delay_slot(regs))) return regs->cp0_epc; + if (get_isa16_mode(regs->cp0_epc)) + return __isa_exception_epc(regs); + return regs->cp0_epc + 4; } #define BRANCH_LIKELY_TAKEN 0x0001 -extern int __compute_return_epc(struct pt_regs *regs); -extern int __compute_return_epc_for_insn(struct pt_regs *regs, - union mips_instruction insn); - static inline int compute_return_epc(struct pt_regs *regs) { + if (get_isa16_mode(regs->cp0_epc)) { + if (cpu_has_mmips) + return __microMIPS_compute_return_epc(regs); + if (cpu_has_mips16) + return __MIPS16e_compute_return_epc(regs); + return regs->cp0_epc; + } + if (!delay_slot(regs)) { regs->cp0_epc += 4; return 0; @@ -40,4 +55,19 @@ static inline int compute_return_epc(struct pt_regs *regs) return __compute_return_epc(regs); } +static inline int MIPS16e_compute_return_epc(struct pt_regs *regs, + union mips16e_instruction *inst) +{ + if (likely(!delay_slot(regs))) { + if (inst->ri.opcode == MIPS16e_extend_op) { + regs->cp0_epc += 4; + return 0; + } + regs->cp0_epc += 2; + return 0; + } + + return __MIPS16e_compute_return_epc(regs); +} + #endif /* _ASM_BRANCH_H */ diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h index 1a57e8b4d092..e5ec8fcd8afa 100644 --- a/arch/mips/include/asm/cpu-features.h +++ b/arch/mips/include/asm/cpu-features.h @@ -113,6 +113,9 @@ #ifndef cpu_has_pindexed_dcache #define cpu_has_pindexed_dcache (cpu_data[0].dcache.flags & MIPS_CACHE_PINDEX) #endif +#ifndef cpu_has_local_ebase +#define cpu_has_local_ebase 1 +#endif /* * I-Cache snoops remote store. This only matters on SMP. Some multiprocessors diff --git a/arch/mips/include/asm/dma-coherence.h b/arch/mips/include/asm/dma-coherence.h new file mode 100644 index 000000000000..242cbb3ca582 --- /dev/null +++ b/arch/mips/include/asm/dma-coherence.h @@ -0,0 +1,15 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2006 Ralf Baechle + * + */ +#ifndef __ASM_DMA_COHERENCE_H +#define __ASM_DMA_COHERENCE_H + +extern int coherentio; +extern int hw_coherentio; + +#endif diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h index f8fc74b6cb47..84238c574d5e 100644 --- a/arch/mips/include/asm/dma-mapping.h +++ b/arch/mips/include/asm/dma-mapping.h @@ -2,6 +2,7 @@ #define _ASM_DMA_MAPPING_H #include +#include #include #include diff --git a/arch/mips/include/asm/fpu_emulator.h b/arch/mips/include/asm/fpu_emulator.h index 3b4092705567..2abb587d5ab4 100644 --- a/arch/mips/include/asm/fpu_emulator.h +++ b/arch/mips/include/asm/fpu_emulator.h @@ -54,6 +54,12 @@ do { \ extern int mips_dsemul(struct pt_regs *regs, mips_instruction ir, unsigned long cpc); extern int do_dsemulret(struct pt_regs *xcp); +extern int fpu_emulator_cop1Handler(struct pt_regs *xcp, + struct mips_fpu_struct *ctx, int has_fpu, + void *__user *fault_addr); +int process_fpemu_return(int sig, void __user *fault_addr); +int mm_isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn, + unsigned long *contpc); /* * Instruction inserted following the badinst to further tag the sequence diff --git a/arch/mips/include/asm/fw/fw.h b/arch/mips/include/asm/fw/fw.h new file mode 100644 index 000000000000..d6c50a7e9ede --- /dev/null +++ b/arch/mips/include/asm/fw/fw.h @@ -0,0 +1,47 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2012 MIPS Technologies, Inc. + */ +#ifndef __ASM_FW_H_ +#define __ASM_FW_H_ + +#include /* For cleaner code... */ + +enum fw_memtypes { + fw_dontuse, + fw_code, + fw_free, +}; + +typedef struct { + unsigned long base; /* Within KSEG0 */ + unsigned int size; /* bytes */ + enum fw_memtypes type; /* fw_memtypes */ +} fw_memblock_t; + +/* Maximum number of memory block descriptors. */ +#define FW_MAX_MEMBLOCKS 32 + +extern int fw_argc; +extern int *_fw_argv; +extern int *_fw_envp; + +/* + * Most firmware like YAMON, PMON, etc. pass arguments and environment + * variables as 32-bit pointers. These take care of sign extension. + */ +#define fw_argv(index) ((char *)(long)_fw_argv[(index)]) +#define fw_envp(index) ((char *)(long)_fw_envp[(index)]) + +extern void fw_init_cmdline(void); +extern char *fw_getcmdline(void); +extern fw_memblock_t *fw_getmdesc(void); +extern void fw_meminit(void); +extern char *fw_getenv(char *name); +extern unsigned long fw_getenvl(char *name); +extern void fw_init_early_console(char port); + +#endif /* __ASM_FW_H_ */ diff --git a/arch/mips/include/asm/gic.h b/arch/mips/include/asm/gic.h index bdc9786ab5a7..7153b32de18e 100644 --- a/arch/mips/include/asm/gic.h +++ b/arch/mips/include/asm/gic.h @@ -202,7 +202,7 @@ #define GIC_VPE_WD_COUNT0_OFS 0x0094 #define GIC_VPE_WD_INITIAL0_OFS 0x0098 #define GIC_VPE_COMPARE_LO_OFS 0x00a0 -#define GIC_VPE_COMPARE_HI 0x00a4 +#define GIC_VPE_COMPARE_HI_OFS 0x00a4 #define GIC_VPE_EIC_SHADOW_SET_BASE 0x0100 #define GIC_VPE_EIC_SS(intr) \ @@ -359,7 +359,11 @@ struct gic_shared_intr_map { /* Mapped interrupt to pin X, then GIC will generate the vector (X+1). */ #define GIC_PIN_TO_VEC_OFFSET (1) -extern int gic_present; +#include +#include + +extern unsigned int gic_present; +extern unsigned int gic_frequency; extern unsigned long _gic_base; extern unsigned int gic_irq_base; extern unsigned int gic_irq_flags[]; @@ -368,18 +372,20 @@ extern struct gic_shared_intr_map gic_shared_intr_map[]; extern void gic_init(unsigned long gic_base_addr, unsigned long gic_addrspace_size, struct gic_intr_map *intrmap, unsigned int intrmap_size, unsigned int irqbase); - extern void gic_clocksource_init(unsigned int); -extern unsigned int gic_get_int(void); +extern unsigned int gic_compare_int (void); +extern cycle_t gic_read_count(void); +extern cycle_t gic_read_compare(void); +extern void gic_write_compare(cycle_t cnt); extern void gic_send_ipi(unsigned int intr); extern unsigned int plat_ipi_call_int_xlate(unsigned int); extern unsigned int plat_ipi_resched_int_xlate(unsigned int); extern void gic_bind_eic_interrupt(int irq, int set); extern unsigned int gic_get_timer_pending(void); +extern unsigned int gic_get_int(void); extern void gic_enable_interrupt(int irq_vec); extern void gic_disable_interrupt(int irq_vec); extern void gic_irq_ack(struct irq_data *d); extern void gic_finish_irq(struct irq_data *d); extern void gic_platform_init(int irqs, struct irq_chip *irq_controller); - #endif /* _ASM_GICREGS_H */ diff --git a/arch/mips/include/asm/hazards.h b/arch/mips/include/asm/hazards.h index 44d6a5bde4a1..e3ee92d4dbe7 100644 --- a/arch/mips/include/asm/hazards.h +++ b/arch/mips/include/asm/hazards.h @@ -10,34 +10,13 @@ #ifndef _ASM_HAZARDS_H #define _ASM_HAZARDS_H -#ifdef __ASSEMBLY__ -#define ASMMACRO(name, code...) .macro name; code; .endm -#else - -#include - -#define ASMMACRO(name, code...) \ -__asm__(".macro " #name "; " #code "; .endm"); \ - \ -static inline void name(void) \ -{ \ - __asm__ __volatile__ (#name); \ -} - -/* - * MIPS R2 instruction hazard barrier. Needs to be called as a subroutine. - */ -extern void mips_ihb(void); - -#endif +#include -ASMMACRO(_ssnop, - sll $0, $0, 1 - ) +#define ___ssnop \ + sll $0, $0, 1 -ASMMACRO(_ehb, - sll $0, $0, 3 - ) +#define ___ehb \ + sll $0, $0, 3 /* * TLB hazards @@ -48,24 +27,24 @@ ASMMACRO(_ehb, * MIPSR2 defines ehb for hazard avoidance */ -ASMMACRO(mtc0_tlbw_hazard, - _ehb - ) -ASMMACRO(tlbw_use_hazard, - _ehb - ) -ASMMACRO(tlb_probe_hazard, - _ehb - ) -ASMMACRO(irq_enable_hazard, - _ehb - ) -ASMMACRO(irq_disable_hazard, - _ehb - ) -ASMMACRO(back_to_back_c0_hazard, - _ehb - ) +#define __mtc0_tlbw_hazard \ + ___ehb + +#define __tlbw_use_hazard \ + ___ehb + +#define __tlb_probe_hazard \ + ___ehb + +#define __irq_enable_hazard \ + ___ehb + +#define __irq_disable_hazard \ + ___ehb + +#define __back_to_back_c0_hazard \ + ___ehb + /* * gcc has a tradition of misscompiling the previous construct using the * address of a label as argument to inline assembler. Gas otoh has the @@ -94,24 +73,42 @@ do { \ * These are slightly complicated by the fact that we guarantee R1 kernels to * run fine on R2 processors. */ -ASMMACRO(mtc0_tlbw_hazard, - _ssnop; _ssnop; _ehb - ) -ASMMACRO(tlbw_use_hazard, - _ssnop; _ssnop; _ssnop; _ehb - ) -ASMMACRO(tlb_probe_hazard, - _ssnop; _ssnop; _ssnop; _ehb - ) -ASMMACRO(irq_enable_hazard, - _ssnop; _ssnop; _ssnop; _ehb - ) -ASMMACRO(irq_disable_hazard, - _ssnop; _ssnop; _ssnop; _ehb - ) -ASMMACRO(back_to_back_c0_hazard, - _ssnop; _ssnop; _ssnop; _ehb - ) + +#define __mtc0_tlbw_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ehb + +#define __tlbw_use_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop; \ + ___ehb + +#define __tlb_probe_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop; \ + ___ehb + +#define __irq_enable_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop; \ + ___ehb + +#define __irq_disable_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop; \ + ___ehb + +#define __back_to_back_c0_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop; \ + ___ehb + /* * gcc has a tradition of misscompiling the previous construct using the * address of a label as argument to inline assembler. Gas otoh has the @@ -147,18 +144,18 @@ do { \ * R10000 rocks - all hazards handled in hardware, so this becomes a nobrainer. */ -ASMMACRO(mtc0_tlbw_hazard, - ) -ASMMACRO(tlbw_use_hazard, - ) -ASMMACRO(tlb_probe_hazard, - ) -ASMMACRO(irq_enable_hazard, - ) -ASMMACRO(irq_disable_hazard, - ) -ASMMACRO(back_to_back_c0_hazard, - ) +#define __mtc0_tlbw_hazard + +#define __tlbw_use_hazard + +#define __tlb_probe_hazard + +#define __irq_enable_hazard + +#define __irq_disable_hazard + +#define __back_to_back_c0_hazard + #define instruction_hazard() do { } while (0) #elif defined(CONFIG_CPU_SB1) @@ -166,19 +163,21 @@ ASMMACRO(back_to_back_c0_hazard, /* * Mostly like R4000 for historic reasons */ -ASMMACRO(mtc0_tlbw_hazard, - ) -ASMMACRO(tlbw_use_hazard, - ) -ASMMACRO(tlb_probe_hazard, - ) -ASMMACRO(irq_enable_hazard, - ) -ASMMACRO(irq_disable_hazard, - _ssnop; _ssnop; _ssnop - ) -ASMMACRO(back_to_back_c0_hazard, - ) +#define __mtc0_tlbw_hazard + +#define __tlbw_use_hazard + +#define __tlb_probe_hazard + +#define __irq_enable_hazard + +#define __irq_disable_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop + +#define __back_to_back_c0_hazard + #define instruction_hazard() do { } while (0) #else @@ -192,24 +191,35 @@ ASMMACRO(back_to_back_c0_hazard, * hazard so this is nice trick to have an optimal code for a range of * processors. */ -ASMMACRO(mtc0_tlbw_hazard, - nop; nop - ) -ASMMACRO(tlbw_use_hazard, - nop; nop; nop - ) -ASMMACRO(tlb_probe_hazard, - nop; nop; nop - ) -ASMMACRO(irq_enable_hazard, - _ssnop; _ssnop; _ssnop; - ) -ASMMACRO(irq_disable_hazard, - nop; nop; nop - ) -ASMMACRO(back_to_back_c0_hazard, - _ssnop; _ssnop; _ssnop; - ) +#define __mtc0_tlbw_hazard \ + nop; \ + nop + +#define __tlbw_use_hazard \ + nop; \ + nop; \ + nop + +#define __tlb_probe_hazard \ + nop; \ + nop; \ + nop + +#define __irq_enable_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop + +#define __irq_disable_hazard \ + nop; \ + nop; \ + nop + +#define __back_to_back_c0_hazard \ + ___ssnop; \ + ___ssnop; \ + ___ssnop + #define instruction_hazard() do { } while (0) #endif @@ -218,32 +228,137 @@ ASMMACRO(back_to_back_c0_hazard, /* FPU hazards */ #if defined(CONFIG_CPU_SB1) -ASMMACRO(enable_fpu_hazard, - .set push; - .set mips64; - .set noreorder; - _ssnop; - bnezl $0, .+4; - _ssnop; - .set pop -) -ASMMACRO(disable_fpu_hazard, -) + +#define __enable_fpu_hazard \ + .set push; \ + .set mips64; \ + .set noreorder; \ + ___ssnop; \ + bnezl $0, .+4; \ + ___ssnop; \ + .set pop + +#define __disable_fpu_hazard #elif defined(CONFIG_CPU_MIPSR2) -ASMMACRO(enable_fpu_hazard, - _ehb -) -ASMMACRO(disable_fpu_hazard, - _ehb -) + +#define __enable_fpu_hazard \ + ___ehb + +#define __disable_fpu_hazard \ + ___ehb + #else -ASMMACRO(enable_fpu_hazard, - nop; nop; nop; nop -) -ASMMACRO(disable_fpu_hazard, - _ehb -) + +#define __enable_fpu_hazard \ + nop; \ + nop; \ + nop; \ + nop + +#define __disable_fpu_hazard \ + ___ehb + #endif +#ifdef __ASSEMBLY__ + +#define _ssnop ___ssnop +#define _ehb ___ehb +#define mtc0_tlbw_hazard __mtc0_tlbw_hazard +#define tlbw_use_hazard __tlbw_use_hazard +#define tlb_probe_hazard __tlb_probe_hazard +#define irq_enable_hazard __irq_enable_hazard +#define irq_disable_hazard __irq_disable_hazard +#define back_to_back_c0_hazard __back_to_back_c0_hazard +#define enable_fpu_hazard __enable_fpu_hazard +#define disable_fpu_hazard __disable_fpu_hazard + +#else + +#define _ssnop() \ +do { \ + __asm__ __volatile__( \ + __stringify(___ssnop) \ + ); \ +} while (0) + +#define _ehb() \ +do { \ + __asm__ __volatile__( \ + __stringify(___ehb) \ + ); \ +} while (0) + + +#define mtc0_tlbw_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__mtc0_tlbw_hazard) \ + ); \ +} while (0) + + +#define tlbw_use_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__tlbw_use_hazard) \ + ); \ +} while (0) + + +#define tlb_probe_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__tlb_probe_hazard) \ + ); \ +} while (0) + + +#define irq_enable_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__irq_enable_hazard) \ + ); \ +} while (0) + + +#define irq_disable_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__irq_disable_hazard) \ + ); \ +} while (0) + + +#define back_to_back_c0_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__back_to_back_c0_hazard) \ + ); \ +} while (0) + + +#define enable_fpu_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__enable_fpu_hazard) \ + ); \ +} while (0) + + +#define disable_fpu_hazard() \ +do { \ + __asm__ __volatile__( \ + __stringify(__disable_fpu_hazard) \ + ); \ +} while (0) + +/* + * MIPS R2 instruction hazard barrier. Needs to be called as a subroutine. + */ +extern void mips_ihb(void); + +#endif /* __ASSEMBLY__ */ + #endif /* _ASM_HAZARDS_H */ diff --git a/arch/mips/include/asm/inst.h b/arch/mips/include/asm/inst.h index f1eadf764071..22912f78401c 100644 --- a/arch/mips/include/asm/inst.h +++ b/arch/mips/include/asm/inst.h @@ -73,4 +73,16 @@ typedef unsigned int mips_instruction; +/* microMIPS instruction decode structure. Do NOT export!!! */ +struct mm_decoded_insn { + mips_instruction insn; + mips_instruction next_insn; + int pc_inc; + int next_pc_inc; + int micro_mips_mode; +}; + +/* Recode table from 16-bit register notation to 32-bit GPR. Do NOT export!!! */ +extern const int reg16to32[]; + #endif /* _ASM_INST_H */ diff --git a/arch/mips/include/asm/irqflags.h b/arch/mips/include/asm/irqflags.h index 9f3384c789d7..45c00951888b 100644 --- a/arch/mips/include/asm/irqflags.h +++ b/arch/mips/include/asm/irqflags.h @@ -14,53 +14,48 @@ #ifndef __ASSEMBLY__ #include +#include #include #if defined(CONFIG_CPU_MIPSR2) && !defined(CONFIG_MIPS_MT_SMTC) -__asm__( - " .macro arch_local_irq_disable\n" +static inline void arch_local_irq_disable(void) +{ + __asm__ __volatile__( " .set push \n" " .set noat \n" " di \n" - " irq_disable_hazard \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); - -static inline void arch_local_irq_disable(void) -{ - __asm__ __volatile__( - "arch_local_irq_disable" - : /* no outputs */ - : /* no inputs */ - : "memory"); + : /* no outputs */ + : /* no inputs */ + : "memory"); } +static inline unsigned long arch_local_irq_save(void) +{ + unsigned long flags; -__asm__( - " .macro arch_local_irq_save result \n" + asm __volatile__( " .set push \n" " .set reorder \n" " .set noat \n" - " di \\result \n" - " andi \\result, 1 \n" - " irq_disable_hazard \n" + " di %[flags] \n" + " andi %[flags], 1 \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); + : [flags] "=r" (flags) + : /* no inputs */ + : "memory"); -static inline unsigned long arch_local_irq_save(void) -{ - unsigned long flags; - asm volatile("arch_local_irq_save\t%0" - : "=r" (flags) - : /* no inputs */ - : "memory"); return flags; } +static inline void arch_local_irq_restore(unsigned long flags) +{ + unsigned long __tmp1; -__asm__( - " .macro arch_local_irq_restore flags \n" + __asm__ __volatile__( " .set push \n" " .set noreorder \n" " .set noat \n" @@ -69,7 +64,7 @@ __asm__( * Slow, but doesn't suffer from a relatively unlikely race * condition we're having since days 1. */ - " beqz \\flags, 1f \n" + " beqz %[flags], 1f \n" " di \n" " ei \n" "1: \n" @@ -78,33 +73,44 @@ __asm__( * Fast, dangerous. Life is fun, life is good. */ " mfc0 $1, $12 \n" - " ins $1, \\flags, 0, 1 \n" + " ins $1, %[flags], 0, 1 \n" " mtc0 $1, $12 \n" #endif - " irq_disable_hazard \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); - -static inline void arch_local_irq_restore(unsigned long flags) -{ - unsigned long __tmp1; - - __asm__ __volatile__( - "arch_local_irq_restore\t%0" - : "=r" (__tmp1) - : "0" (flags) - : "memory"); + : [flags] "=r" (__tmp1) + : "0" (flags) + : "memory"); } static inline void __arch_local_irq_restore(unsigned long flags) { - unsigned long __tmp1; - __asm__ __volatile__( - "arch_local_irq_restore\t%0" - : "=r" (__tmp1) - : "0" (flags) - : "memory"); + " .set push \n" + " .set noreorder \n" + " .set noat \n" +#if defined(CONFIG_IRQ_CPU) + /* + * Slow, but doesn't suffer from a relatively unlikely race + * condition we're having since days 1. + */ + " beqz %[flags], 1f \n" + " di \n" + " ei \n" + "1: \n" +#else + /* + * Fast, dangerous. Life is fun, life is good. + */ + " mfc0 $1, $12 \n" + " ins $1, %[flags], 0, 1 \n" + " mtc0 $1, $12 \n" +#endif + " " __stringify(__irq_disable_hazard) " \n" + " .set pop \n" + : [flags] "=r" (flags) + : "0" (flags) + : "memory"); } #else /* Functions that require preempt_{dis,en}able() are in mips-atomic.c */ @@ -115,8 +121,18 @@ void __arch_local_irq_restore(unsigned long flags); #endif /* if defined(CONFIG_CPU_MIPSR2) && !defined(CONFIG_MIPS_MT_SMTC) */ -__asm__( - " .macro arch_local_irq_enable \n" +extern void smtc_ipi_replay(void); + +static inline void arch_local_irq_enable(void) +{ +#ifdef CONFIG_MIPS_MT_SMTC + /* + * SMTC kernel needs to do a software replay of queued + * IPIs, at the cost of call overhead on each local_irq_enable() + */ + smtc_ipi_replay(); +#endif + __asm__ __volatile__( " .set push \n" " .set reorder \n" " .set noat \n" @@ -133,45 +149,28 @@ __asm__( " xori $1,0x1e \n" " mtc0 $1,$12 \n" #endif - " irq_enable_hazard \n" + " " __stringify(__irq_enable_hazard) " \n" " .set pop \n" - " .endm"); - -extern void smtc_ipi_replay(void); - -static inline void arch_local_irq_enable(void) -{ -#ifdef CONFIG_MIPS_MT_SMTC - /* - * SMTC kernel needs to do a software replay of queued - * IPIs, at the cost of call overhead on each local_irq_enable() - */ - smtc_ipi_replay(); -#endif - __asm__ __volatile__( - "arch_local_irq_enable" - : /* no outputs */ - : /* no inputs */ - : "memory"); + : /* no outputs */ + : /* no inputs */ + : "memory"); } +static inline unsigned long arch_local_save_flags(void) +{ + unsigned long flags; -__asm__( - " .macro arch_local_save_flags flags \n" + asm __volatile__( " .set push \n" " .set reorder \n" #ifdef CONFIG_MIPS_MT_SMTC - " mfc0 \\flags, $2, 1 \n" + " mfc0 %[flags], $2, 1 \n" #else - " mfc0 \\flags, $12 \n" + " mfc0 %[flags], $12 \n" #endif " .set pop \n" - " .endm \n"); + : [flags] "=r" (flags)); -static inline unsigned long arch_local_save_flags(void) -{ - unsigned long flags; - asm volatile("arch_local_save_flags %0" : "=r" (flags)); return flags; } diff --git a/arch/mips/include/asm/kvm.h b/arch/mips/include/asm/kvm.h new file mode 100644 index 000000000000..85789eacbf18 --- /dev/null +++ b/arch/mips/include/asm/kvm.h @@ -0,0 +1,55 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#ifndef __LINUX_KVM_MIPS_H +#define __LINUX_KVM_MIPS_H + +#include + +#define __KVM_MIPS + +#define N_MIPS_COPROC_REGS 32 +#define N_MIPS_COPROC_SEL 8 + +/* for KVM_GET_REGS and KVM_SET_REGS */ +struct kvm_regs { + __u32 gprs[32]; + __u32 hi; + __u32 lo; + __u32 pc; + + __u32 cp0reg[N_MIPS_COPROC_REGS][N_MIPS_COPROC_SEL]; +}; + +/* for KVM_GET_SREGS and KVM_SET_SREGS */ +struct kvm_sregs { +}; + +/* for KVM_GET_FPU and KVM_SET_FPU */ +struct kvm_fpu { +}; + +struct kvm_debug_exit_arch { +}; + +/* for KVM_SET_GUEST_DEBUG */ +struct kvm_guest_debug_arch { +}; + +struct kvm_mips_interrupt { + /* in */ + __u32 cpu; + __u32 irq; +}; + +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; + +#endif /* __LINUX_KVM_MIPS_H */ diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h new file mode 100644 index 000000000000..e68781e18387 --- /dev/null +++ b/arch/mips/include/asm/kvm_host.h @@ -0,0 +1,667 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#ifndef __MIPS_KVM_HOST_H__ +#define __MIPS_KVM_HOST_H__ + +#include +#include +#include +#include +#include +#include +#include +#include + + +#define KVM_MAX_VCPUS 1 +#define KVM_USER_MEM_SLOTS 8 +/* memory slots that does not exposed to userspace */ +#define KVM_PRIVATE_MEM_SLOTS 0 + +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + +/* Don't support huge pages */ +#define KVM_HPAGE_GFN_SHIFT(x) 0 + +/* We don't currently support large pages. */ +#define KVM_NR_PAGE_SIZES 1 +#define KVM_PAGES_PER_HPAGE(x) 1 + + + +/* Special address that contains the comm page, used for reducing # of traps */ +#define KVM_GUEST_COMMPAGE_ADDR 0x0 + +#define KVM_GUEST_KERNEL_MODE(vcpu) ((kvm_read_c0_guest_status(vcpu->arch.cop0) & (ST0_EXL | ST0_ERL)) || \ + ((kvm_read_c0_guest_status(vcpu->arch.cop0) & KSU_USER) == 0)) + +#define KVM_GUEST_KUSEG 0x00000000UL +#define KVM_GUEST_KSEG0 0x40000000UL +#define KVM_GUEST_KSEG23 0x60000000UL +#define KVM_GUEST_KSEGX(a) ((_ACAST32_(a)) & 0x60000000) +#define KVM_GUEST_CPHYSADDR(a) ((_ACAST32_(a)) & 0x1fffffff) + +#define KVM_GUEST_CKSEG0ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG0) +#define KVM_GUEST_CKSEG1ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG1) +#define KVM_GUEST_CKSEG23ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG23) + +/* + * Map an address to a certain kernel segment + */ +#define KVM_GUEST_KSEG0ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG0) +#define KVM_GUEST_KSEG1ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG1) +#define KVM_GUEST_KSEG23ADDR(a) (KVM_GUEST_CPHYSADDR(a) | KVM_GUEST_KSEG23) + +#define KVM_INVALID_PAGE 0xdeadbeef +#define KVM_INVALID_INST 0xdeadbeef +#define KVM_INVALID_ADDR 0xdeadbeef + +#define KVM_MALTA_GUEST_RTC_ADDR 0xb8000070UL + +#define GUEST_TICKS_PER_JIFFY (40000000/HZ) +#define MS_TO_NS(x) (x * 1E6L) + +#define CAUSEB_DC 27 +#define CAUSEF_DC (_ULCAST_(1) << 27) + +struct kvm; +struct kvm_run; +struct kvm_vcpu; +struct kvm_interrupt; + +extern atomic_t kvm_mips_instance; +extern pfn_t(*kvm_mips_gfn_to_pfn) (struct kvm *kvm, gfn_t gfn); +extern void (*kvm_mips_release_pfn_clean) (pfn_t pfn); +extern bool(*kvm_mips_is_error_pfn) (pfn_t pfn); + +struct kvm_vm_stat { + u32 remote_tlb_flush; +}; + +struct kvm_vcpu_stat { + u32 wait_exits; + u32 cache_exits; + u32 signal_exits; + u32 int_exits; + u32 cop_unusable_exits; + u32 tlbmod_exits; + u32 tlbmiss_ld_exits; + u32 tlbmiss_st_exits; + u32 addrerr_st_exits; + u32 addrerr_ld_exits; + u32 syscall_exits; + u32 resvd_inst_exits; + u32 break_inst_exits; + u32 flush_dcache_exits; + u32 halt_wakeup; +}; + +enum kvm_mips_exit_types { + WAIT_EXITS, + CACHE_EXITS, + SIGNAL_EXITS, + INT_EXITS, + COP_UNUSABLE_EXITS, + TLBMOD_EXITS, + TLBMISS_LD_EXITS, + TLBMISS_ST_EXITS, + ADDRERR_ST_EXITS, + ADDRERR_LD_EXITS, + SYSCALL_EXITS, + RESVD_INST_EXITS, + BREAK_INST_EXITS, + FLUSH_DCACHE_EXITS, + MAX_KVM_MIPS_EXIT_TYPES +}; + +struct kvm_arch_memory_slot { +}; + +struct kvm_arch { + /* Guest GVA->HPA page table */ + unsigned long *guest_pmap; + unsigned long guest_pmap_npages; + + /* Wired host TLB used for the commpage */ + int commpage_tlb; +}; + +#define N_MIPS_COPROC_REGS 32 +#define N_MIPS_COPROC_SEL 8 + +struct mips_coproc { + unsigned long reg[N_MIPS_COPROC_REGS][N_MIPS_COPROC_SEL]; +#ifdef CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS + unsigned long stat[N_MIPS_COPROC_REGS][N_MIPS_COPROC_SEL]; +#endif +}; + +/* + * Coprocessor 0 register names + */ +#define MIPS_CP0_TLB_INDEX 0 +#define MIPS_CP0_TLB_RANDOM 1 +#define MIPS_CP0_TLB_LOW 2 +#define MIPS_CP0_TLB_LO0 2 +#define MIPS_CP0_TLB_LO1 3 +#define MIPS_CP0_TLB_CONTEXT 4 +#define MIPS_CP0_TLB_PG_MASK 5 +#define MIPS_CP0_TLB_WIRED 6 +#define MIPS_CP0_HWRENA 7 +#define MIPS_CP0_BAD_VADDR 8 +#define MIPS_CP0_COUNT 9 +#define MIPS_CP0_TLB_HI 10 +#define MIPS_CP0_COMPARE 11 +#define MIPS_CP0_STATUS 12 +#define MIPS_CP0_CAUSE 13 +#define MIPS_CP0_EXC_PC 14 +#define MIPS_CP0_PRID 15 +#define MIPS_CP0_CONFIG 16 +#define MIPS_CP0_LLADDR 17 +#define MIPS_CP0_WATCH_LO 18 +#define MIPS_CP0_WATCH_HI 19 +#define MIPS_CP0_TLB_XCONTEXT 20 +#define MIPS_CP0_ECC 26 +#define MIPS_CP0_CACHE_ERR 27 +#define MIPS_CP0_TAG_LO 28 +#define MIPS_CP0_TAG_HI 29 +#define MIPS_CP0_ERROR_PC 30 +#define MIPS_CP0_DEBUG 23 +#define MIPS_CP0_DEPC 24 +#define MIPS_CP0_PERFCNT 25 +#define MIPS_CP0_ERRCTL 26 +#define MIPS_CP0_DATA_LO 28 +#define MIPS_CP0_DATA_HI 29 +#define MIPS_CP0_DESAVE 31 + +#define MIPS_CP0_CONFIG_SEL 0 +#define MIPS_CP0_CONFIG1_SEL 1 +#define MIPS_CP0_CONFIG2_SEL 2 +#define MIPS_CP0_CONFIG3_SEL 3 + +/* Config0 register bits */ +#define CP0C0_M 31 +#define CP0C0_K23 28 +#define CP0C0_KU 25 +#define CP0C0_MDU 20 +#define CP0C0_MM 17 +#define CP0C0_BM 16 +#define CP0C0_BE 15 +#define CP0C0_AT 13 +#define CP0C0_AR 10 +#define CP0C0_MT 7 +#define CP0C0_VI 3 +#define CP0C0_K0 0 + +/* Config1 register bits */ +#define CP0C1_M 31 +#define CP0C1_MMU 25 +#define CP0C1_IS 22 +#define CP0C1_IL 19 +#define CP0C1_IA 16 +#define CP0C1_DS 13 +#define CP0C1_DL 10 +#define CP0C1_DA 7 +#define CP0C1_C2 6 +#define CP0C1_MD 5 +#define CP0C1_PC 4 +#define CP0C1_WR 3 +#define CP0C1_CA 2 +#define CP0C1_EP 1 +#define CP0C1_FP 0 + +/* Config2 Register bits */ +#define CP0C2_M 31 +#define CP0C2_TU 28 +#define CP0C2_TS 24 +#define CP0C2_TL 20 +#define CP0C2_TA 16 +#define CP0C2_SU 12 +#define CP0C2_SS 8 +#define CP0C2_SL 4 +#define CP0C2_SA 0 + +/* Config3 Register bits */ +#define CP0C3_M 31 +#define CP0C3_ISA_ON_EXC 16 +#define CP0C3_ULRI 13 +#define CP0C3_DSPP 10 +#define CP0C3_LPA 7 +#define CP0C3_VEIC 6 +#define CP0C3_VInt 5 +#define CP0C3_SP 4 +#define CP0C3_MT 2 +#define CP0C3_SM 1 +#define CP0C3_TL 0 + +/* Have config1, Cacheable, noncoherent, write-back, write allocate*/ +#define MIPS_CONFIG0 \ + ((1 << CP0C0_M) | (0x3 << CP0C0_K0)) + +/* Have config2, no coprocessor2 attached, no MDMX support attached, + no performance counters, watch registers present, + no code compression, EJTAG present, no FPU, no watch registers */ +#define MIPS_CONFIG1 \ +((1 << CP0C1_M) | \ + (0 << CP0C1_C2) | (0 << CP0C1_MD) | (0 << CP0C1_PC) | \ + (0 << CP0C1_WR) | (0 << CP0C1_CA) | (1 << CP0C1_EP) | \ + (0 << CP0C1_FP)) + +/* Have config3, no tertiary/secondary caches implemented */ +#define MIPS_CONFIG2 \ +((1 << CP0C2_M)) + +/* No config4, no DSP ASE, no large physaddr (PABITS), + no external interrupt controller, no vectored interrupts, + no 1kb pages, no SmartMIPS ASE, no trace logic */ +#define MIPS_CONFIG3 \ +((0 << CP0C3_M) | (0 << CP0C3_DSPP) | (0 << CP0C3_LPA) | \ + (0 << CP0C3_VEIC) | (0 << CP0C3_VInt) | (0 << CP0C3_SP) | \ + (0 << CP0C3_SM) | (0 << CP0C3_TL)) + +/* MMU types, the first four entries have the same layout as the + CP0C0_MT field. */ +enum mips_mmu_types { + MMU_TYPE_NONE, + MMU_TYPE_R4000, + MMU_TYPE_RESERVED, + MMU_TYPE_FMT, + MMU_TYPE_R3000, + MMU_TYPE_R6000, + MMU_TYPE_R8000 +}; + +/* + * Trap codes + */ +#define T_INT 0 /* Interrupt pending */ +#define T_TLB_MOD 1 /* TLB modified fault */ +#define T_TLB_LD_MISS 2 /* TLB miss on load or ifetch */ +#define T_TLB_ST_MISS 3 /* TLB miss on a store */ +#define T_ADDR_ERR_LD 4 /* Address error on a load or ifetch */ +#define T_ADDR_ERR_ST 5 /* Address error on a store */ +#define T_BUS_ERR_IFETCH 6 /* Bus error on an ifetch */ +#define T_BUS_ERR_LD_ST 7 /* Bus error on a load or store */ +#define T_SYSCALL 8 /* System call */ +#define T_BREAK 9 /* Breakpoint */ +#define T_RES_INST 10 /* Reserved instruction exception */ +#define T_COP_UNUSABLE 11 /* Coprocessor unusable */ +#define T_OVFLOW 12 /* Arithmetic overflow */ + +/* + * Trap definitions added for r4000 port. + */ +#define T_TRAP 13 /* Trap instruction */ +#define T_VCEI 14 /* Virtual coherency exception */ +#define T_FPE 15 /* Floating point exception */ +#define T_WATCH 23 /* Watch address reference */ +#define T_VCED 31 /* Virtual coherency data */ + +/* Resume Flags */ +#define RESUME_FLAG_DR (1<<0) /* Reload guest nonvolatile state? */ +#define RESUME_FLAG_HOST (1<<1) /* Resume host? */ + +#define RESUME_GUEST 0 +#define RESUME_GUEST_DR RESUME_FLAG_DR +#define RESUME_HOST RESUME_FLAG_HOST + +enum emulation_result { + EMULATE_DONE, /* no further processing */ + EMULATE_DO_MMIO, /* kvm_run filled with MMIO request */ + EMULATE_FAIL, /* can't emulate this instruction */ + EMULATE_WAIT, /* WAIT instruction */ + EMULATE_PRIV_FAIL, +}; + +#define MIPS3_PG_G 0x00000001 /* Global; ignore ASID if in lo0 & lo1 */ +#define MIPS3_PG_V 0x00000002 /* Valid */ +#define MIPS3_PG_NV 0x00000000 +#define MIPS3_PG_D 0x00000004 /* Dirty */ + +#define mips3_paddr_to_tlbpfn(x) \ + (((unsigned long)(x) >> MIPS3_PG_SHIFT) & MIPS3_PG_FRAME) +#define mips3_tlbpfn_to_paddr(x) \ + ((unsigned long)((x) & MIPS3_PG_FRAME) << MIPS3_PG_SHIFT) + +#define MIPS3_PG_SHIFT 6 +#define MIPS3_PG_FRAME 0x3fffffc0 + +#define VPN2_MASK 0xffffe000 +#define TLB_IS_GLOBAL(x) (((x).tlb_lo0 & MIPS3_PG_G) && ((x).tlb_lo1 & MIPS3_PG_G)) +#define TLB_VPN2(x) ((x).tlb_hi & VPN2_MASK) +#define TLB_ASID(x) (ASID_MASK((x).tlb_hi)) +#define TLB_IS_VALID(x, va) (((va) & (1 << PAGE_SHIFT)) ? ((x).tlb_lo1 & MIPS3_PG_V) : ((x).tlb_lo0 & MIPS3_PG_V)) + +struct kvm_mips_tlb { + long tlb_mask; + long tlb_hi; + long tlb_lo0; + long tlb_lo1; +}; + +#define KVM_MIPS_GUEST_TLB_SIZE 64 +struct kvm_vcpu_arch { + void *host_ebase, *guest_ebase; + unsigned long host_stack; + unsigned long host_gp; + + /* Host CP0 registers used when handling exits from guest */ + unsigned long host_cp0_badvaddr; + unsigned long host_cp0_cause; + unsigned long host_cp0_epc; + unsigned long host_cp0_entryhi; + uint32_t guest_inst; + + /* GPRS */ + unsigned long gprs[32]; + unsigned long hi; + unsigned long lo; + unsigned long pc; + + /* FPU State */ + struct mips_fpu_struct fpu; + + /* COP0 State */ + struct mips_coproc *cop0; + + /* Host KSEG0 address of the EI/DI offset */ + void *kseg0_commpage; + + u32 io_gpr; /* GPR used as IO source/target */ + + /* Used to calibrate the virutal count register for the guest */ + int32_t host_cp0_count; + + /* Bitmask of exceptions that are pending */ + unsigned long pending_exceptions; + + /* Bitmask of pending exceptions to be cleared */ + unsigned long pending_exceptions_clr; + + unsigned long pending_load_cause; + + /* Save/Restore the entryhi register when are are preempted/scheduled back in */ + unsigned long preempt_entryhi; + + /* S/W Based TLB for guest */ + struct kvm_mips_tlb guest_tlb[KVM_MIPS_GUEST_TLB_SIZE]; + + /* Cached guest kernel/user ASIDs */ + uint32_t guest_user_asid[NR_CPUS]; + uint32_t guest_kernel_asid[NR_CPUS]; + struct mm_struct guest_kernel_mm, guest_user_mm; + + struct kvm_mips_tlb shadow_tlb[NR_CPUS][KVM_MIPS_GUEST_TLB_SIZE]; + + + struct hrtimer comparecount_timer; + + int last_sched_cpu; + + /* WAIT executed */ + int wait; +}; + + +#define kvm_read_c0_guest_index(cop0) (cop0->reg[MIPS_CP0_TLB_INDEX][0]) +#define kvm_write_c0_guest_index(cop0, val) (cop0->reg[MIPS_CP0_TLB_INDEX][0] = val) +#define kvm_read_c0_guest_entrylo0(cop0) (cop0->reg[MIPS_CP0_TLB_LO0][0]) +#define kvm_read_c0_guest_entrylo1(cop0) (cop0->reg[MIPS_CP0_TLB_LO1][0]) +#define kvm_read_c0_guest_context(cop0) (cop0->reg[MIPS_CP0_TLB_CONTEXT][0]) +#define kvm_write_c0_guest_context(cop0, val) (cop0->reg[MIPS_CP0_TLB_CONTEXT][0] = (val)) +#define kvm_read_c0_guest_userlocal(cop0) (cop0->reg[MIPS_CP0_TLB_CONTEXT][2]) +#define kvm_read_c0_guest_pagemask(cop0) (cop0->reg[MIPS_CP0_TLB_PG_MASK][0]) +#define kvm_write_c0_guest_pagemask(cop0, val) (cop0->reg[MIPS_CP0_TLB_PG_MASK][0] = (val)) +#define kvm_read_c0_guest_wired(cop0) (cop0->reg[MIPS_CP0_TLB_WIRED][0]) +#define kvm_write_c0_guest_wired(cop0, val) (cop0->reg[MIPS_CP0_TLB_WIRED][0] = (val)) +#define kvm_read_c0_guest_badvaddr(cop0) (cop0->reg[MIPS_CP0_BAD_VADDR][0]) +#define kvm_write_c0_guest_badvaddr(cop0, val) (cop0->reg[MIPS_CP0_BAD_VADDR][0] = (val)) +#define kvm_read_c0_guest_count(cop0) (cop0->reg[MIPS_CP0_COUNT][0]) +#define kvm_write_c0_guest_count(cop0, val) (cop0->reg[MIPS_CP0_COUNT][0] = (val)) +#define kvm_read_c0_guest_entryhi(cop0) (cop0->reg[MIPS_CP0_TLB_HI][0]) +#define kvm_write_c0_guest_entryhi(cop0, val) (cop0->reg[MIPS_CP0_TLB_HI][0] = (val)) +#define kvm_read_c0_guest_compare(cop0) (cop0->reg[MIPS_CP0_COMPARE][0]) +#define kvm_write_c0_guest_compare(cop0, val) (cop0->reg[MIPS_CP0_COMPARE][0] = (val)) +#define kvm_read_c0_guest_status(cop0) (cop0->reg[MIPS_CP0_STATUS][0]) +#define kvm_write_c0_guest_status(cop0, val) (cop0->reg[MIPS_CP0_STATUS][0] = (val)) +#define kvm_read_c0_guest_intctl(cop0) (cop0->reg[MIPS_CP0_STATUS][1]) +#define kvm_write_c0_guest_intctl(cop0, val) (cop0->reg[MIPS_CP0_STATUS][1] = (val)) +#define kvm_read_c0_guest_cause(cop0) (cop0->reg[MIPS_CP0_CAUSE][0]) +#define kvm_write_c0_guest_cause(cop0, val) (cop0->reg[MIPS_CP0_CAUSE][0] = (val)) +#define kvm_read_c0_guest_epc(cop0) (cop0->reg[MIPS_CP0_EXC_PC][0]) +#define kvm_write_c0_guest_epc(cop0, val) (cop0->reg[MIPS_CP0_EXC_PC][0] = (val)) +#define kvm_read_c0_guest_prid(cop0) (cop0->reg[MIPS_CP0_PRID][0]) +#define kvm_write_c0_guest_prid(cop0, val) (cop0->reg[MIPS_CP0_PRID][0] = (val)) +#define kvm_read_c0_guest_ebase(cop0) (cop0->reg[MIPS_CP0_PRID][1]) +#define kvm_write_c0_guest_ebase(cop0, val) (cop0->reg[MIPS_CP0_PRID][1] = (val)) +#define kvm_read_c0_guest_config(cop0) (cop0->reg[MIPS_CP0_CONFIG][0]) +#define kvm_read_c0_guest_config1(cop0) (cop0->reg[MIPS_CP0_CONFIG][1]) +#define kvm_read_c0_guest_config2(cop0) (cop0->reg[MIPS_CP0_CONFIG][2]) +#define kvm_read_c0_guest_config3(cop0) (cop0->reg[MIPS_CP0_CONFIG][3]) +#define kvm_read_c0_guest_config7(cop0) (cop0->reg[MIPS_CP0_CONFIG][7]) +#define kvm_write_c0_guest_config(cop0, val) (cop0->reg[MIPS_CP0_CONFIG][0] = (val)) +#define kvm_write_c0_guest_config1(cop0, val) (cop0->reg[MIPS_CP0_CONFIG][1] = (val)) +#define kvm_write_c0_guest_config2(cop0, val) (cop0->reg[MIPS_CP0_CONFIG][2] = (val)) +#define kvm_write_c0_guest_config3(cop0, val) (cop0->reg[MIPS_CP0_CONFIG][3] = (val)) +#define kvm_write_c0_guest_config7(cop0, val) (cop0->reg[MIPS_CP0_CONFIG][7] = (val)) +#define kvm_read_c0_guest_errorepc(cop0) (cop0->reg[MIPS_CP0_ERROR_PC][0]) +#define kvm_write_c0_guest_errorepc(cop0, val) (cop0->reg[MIPS_CP0_ERROR_PC][0] = (val)) + +#define kvm_set_c0_guest_status(cop0, val) (cop0->reg[MIPS_CP0_STATUS][0] |= (val)) +#define kvm_clear_c0_guest_status(cop0, val) (cop0->reg[MIPS_CP0_STATUS][0] &= ~(val)) +#define kvm_set_c0_guest_cause(cop0, val) (cop0->reg[MIPS_CP0_CAUSE][0] |= (val)) +#define kvm_clear_c0_guest_cause(cop0, val) (cop0->reg[MIPS_CP0_CAUSE][0] &= ~(val)) +#define kvm_change_c0_guest_cause(cop0, change, val) \ +{ \ + kvm_clear_c0_guest_cause(cop0, change); \ + kvm_set_c0_guest_cause(cop0, ((val) & (change))); \ +} +#define kvm_set_c0_guest_ebase(cop0, val) (cop0->reg[MIPS_CP0_PRID][1] |= (val)) +#define kvm_clear_c0_guest_ebase(cop0, val) (cop0->reg[MIPS_CP0_PRID][1] &= ~(val)) +#define kvm_change_c0_guest_ebase(cop0, change, val) \ +{ \ + kvm_clear_c0_guest_ebase(cop0, change); \ + kvm_set_c0_guest_ebase(cop0, ((val) & (change))); \ +} + + +struct kvm_mips_callbacks { + int (*handle_cop_unusable) (struct kvm_vcpu *vcpu); + int (*handle_tlb_mod) (struct kvm_vcpu *vcpu); + int (*handle_tlb_ld_miss) (struct kvm_vcpu *vcpu); + int (*handle_tlb_st_miss) (struct kvm_vcpu *vcpu); + int (*handle_addr_err_st) (struct kvm_vcpu *vcpu); + int (*handle_addr_err_ld) (struct kvm_vcpu *vcpu); + int (*handle_syscall) (struct kvm_vcpu *vcpu); + int (*handle_res_inst) (struct kvm_vcpu *vcpu); + int (*handle_break) (struct kvm_vcpu *vcpu); + int (*vm_init) (struct kvm *kvm); + int (*vcpu_init) (struct kvm_vcpu *vcpu); + int (*vcpu_setup) (struct kvm_vcpu *vcpu); + gpa_t(*gva_to_gpa) (gva_t gva); + void (*queue_timer_int) (struct kvm_vcpu *vcpu); + void (*dequeue_timer_int) (struct kvm_vcpu *vcpu); + void (*queue_io_int) (struct kvm_vcpu *vcpu, + struct kvm_mips_interrupt *irq); + void (*dequeue_io_int) (struct kvm_vcpu *vcpu, + struct kvm_mips_interrupt *irq); + int (*irq_deliver) (struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause); + int (*irq_clear) (struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause); + int (*vcpu_ioctl_get_regs) (struct kvm_vcpu *vcpu, + struct kvm_regs *regs); + int (*vcpu_ioctl_set_regs) (struct kvm_vcpu *vcpu, + struct kvm_regs *regs); +}; +extern struct kvm_mips_callbacks *kvm_mips_callbacks; +int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks); + +/* Debug: dump vcpu state */ +int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu); + +/* Trampoline ASM routine to start running in "Guest" context */ +extern int __kvm_mips_vcpu_run(struct kvm_run *run, struct kvm_vcpu *vcpu); + +/* TLB handling */ +uint32_t kvm_get_kernel_asid(struct kvm_vcpu *vcpu); + +uint32_t kvm_get_user_asid(struct kvm_vcpu *vcpu); + +uint32_t kvm_get_commpage_asid (struct kvm_vcpu *vcpu); + +extern int kvm_mips_handle_kseg0_tlb_fault(unsigned long badbaddr, + struct kvm_vcpu *vcpu); + +extern int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr, + struct kvm_vcpu *vcpu); + +extern int kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu, + struct kvm_mips_tlb *tlb, + unsigned long *hpa0, + unsigned long *hpa1); + +extern enum emulation_result kvm_mips_handle_tlbmiss(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_handle_tlbmod(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern void kvm_mips_dump_host_tlbs(void); +extern void kvm_mips_dump_guest_tlbs(struct kvm_vcpu *vcpu); +extern void kvm_mips_dump_shadow_tlbs(struct kvm_vcpu *vcpu); +extern void kvm_mips_flush_host_tlb(int skip_kseg0); +extern int kvm_mips_host_tlb_inv(struct kvm_vcpu *vcpu, unsigned long entryhi); +extern int kvm_mips_host_tlb_inv_index(struct kvm_vcpu *vcpu, int index); + +extern int kvm_mips_guest_tlb_lookup(struct kvm_vcpu *vcpu, + unsigned long entryhi); +extern int kvm_mips_host_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long vaddr); +extern unsigned long kvm_mips_translate_guest_kseg0_to_hpa(struct kvm_vcpu *vcpu, + unsigned long gva); +extern void kvm_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu, + struct kvm_vcpu *vcpu); +extern void kvm_shadow_tlb_put(struct kvm_vcpu *vcpu); +extern void kvm_shadow_tlb_load(struct kvm_vcpu *vcpu); +extern void kvm_local_flush_tlb_all(void); +extern void kvm_mips_init_shadow_tlb(struct kvm_vcpu *vcpu); +extern void kvm_mips_alloc_new_mmu_context(struct kvm_vcpu *vcpu); +extern void kvm_mips_vcpu_load(struct kvm_vcpu *vcpu, int cpu); +extern void kvm_mips_vcpu_put(struct kvm_vcpu *vcpu); + +/* Emulation */ +uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu); +enum emulation_result update_pc(struct kvm_vcpu *vcpu, uint32_t cause); + +extern enum emulation_result kvm_mips_emulate_inst(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_syscall(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_tlbmiss_ld(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_tlbinv_ld(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_tlbmiss_st(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_tlbinv_st(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_tlbmod(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_fpu_exc(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_handle_ri(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_ri_exc(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_emulate_bp_exc(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +extern enum emulation_result kvm_mips_complete_mmio_load(struct kvm_vcpu *vcpu, + struct kvm_run *run); + +enum emulation_result kvm_mips_emulate_count(struct kvm_vcpu *vcpu); + +enum emulation_result kvm_mips_check_privilege(unsigned long cause, + uint32_t *opc, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +enum emulation_result kvm_mips_emulate_cache(uint32_t inst, + uint32_t *opc, + uint32_t cause, + struct kvm_run *run, + struct kvm_vcpu *vcpu); +enum emulation_result kvm_mips_emulate_CP0(uint32_t inst, + uint32_t *opc, + uint32_t cause, + struct kvm_run *run, + struct kvm_vcpu *vcpu); +enum emulation_result kvm_mips_emulate_store(uint32_t inst, + uint32_t cause, + struct kvm_run *run, + struct kvm_vcpu *vcpu); +enum emulation_result kvm_mips_emulate_load(uint32_t inst, + uint32_t cause, + struct kvm_run *run, + struct kvm_vcpu *vcpu); + +/* Dynamic binary translation */ +extern int kvm_mips_trans_cache_index(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu); +extern int kvm_mips_trans_cache_va(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu); +extern int kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu); +extern int kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu); + +/* Misc */ +extern void mips32_SyncICache(unsigned long addr, unsigned long size); +extern int kvm_mips_dump_stats(struct kvm_vcpu *vcpu); +extern unsigned long kvm_mips_get_ramsize(struct kvm *kvm); + + +#endif /* __MIPS_KVM_HOST_H__ */ diff --git a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_clk.h b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_clk.h deleted file mode 100644 index 8fcf8df4418a..000000000000 --- a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_clk.h +++ /dev/null @@ -1,11 +0,0 @@ -#ifndef BCM63XX_CLK_H_ -#define BCM63XX_CLK_H_ - -struct clk { - void (*set)(struct clk *, int); - unsigned int rate; - unsigned int usage; - int id; -}; - -#endif /* ! BCM63XX_CLK_H_ */ diff --git a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_cpu.h b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_cpu.h index cb922b9cb0e9..336228990808 100644 --- a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_cpu.h +++ b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_cpu.h @@ -14,11 +14,12 @@ #define BCM6345_CPU_ID 0x6345 #define BCM6348_CPU_ID 0x6348 #define BCM6358_CPU_ID 0x6358 +#define BCM6362_CPU_ID 0x6362 #define BCM6368_CPU_ID 0x6368 void __init bcm63xx_cpu_init(void); u16 __bcm63xx_get_cpu_id(void); -u16 bcm63xx_get_cpu_rev(void); +u8 bcm63xx_get_cpu_rev(void); unsigned int bcm63xx_get_cpu_freq(void); #ifdef CONFIG_BCM63XX_CPU_6328 @@ -86,6 +87,20 @@ unsigned int bcm63xx_get_cpu_freq(void); # define BCMCPU_IS_6358() (0) #endif +#ifdef CONFIG_BCM63XX_CPU_6362 +# ifdef bcm63xx_get_cpu_id +# undef bcm63xx_get_cpu_id +# define bcm63xx_get_cpu_id() __bcm63xx_get_cpu_id() +# define BCMCPU_RUNTIME_DETECT +# else +# define bcm63xx_get_cpu_id() BCM6362_CPU_ID +# endif +# define BCMCPU_IS_6362() (bcm63xx_get_cpu_id() == BCM6362_CPU_ID) +#else +# define BCMCPU_IS_6362() (0) +#endif + + #ifdef CONFIG_BCM63XX_CPU_6368 # ifdef bcm63xx_get_cpu_id # undef bcm63xx_get_cpu_id @@ -405,6 +420,62 @@ enum bcm63xx_regs_set { #define BCM_6358_MISC_BASE (0xdeadbeef) +/* + * 6362 register sets base address + */ +#define BCM_6362_DSL_LMEM_BASE (0xdeadbeef) +#define BCM_6362_PERF_BASE (0xb0000000) +#define BCM_6362_TIMER_BASE (0xb0000040) +#define BCM_6362_WDT_BASE (0xb000005c) +#define BCM_6362_UART0_BASE (0xb0000100) +#define BCM_6362_UART1_BASE (0xb0000120) +#define BCM_6362_GPIO_BASE (0xb0000080) +#define BCM_6362_SPI_BASE (0xb0000800) +#define BCM_6362_HSSPI_BASE (0xb0001000) +#define BCM_6362_UDC0_BASE (0xdeadbeef) +#define BCM_6362_USBDMA_BASE (0xb000c000) +#define BCM_6362_OHCI0_BASE (0xb0002600) +#define BCM_6362_OHCI_PRIV_BASE (0xdeadbeef) +#define BCM_6362_USBH_PRIV_BASE (0xb0002700) +#define BCM_6362_USBD_BASE (0xb0002400) +#define BCM_6362_MPI_BASE (0xdeadbeef) +#define BCM_6362_PCMCIA_BASE (0xdeadbeef) +#define BCM_6362_PCIE_BASE (0xb0e40000) +#define BCM_6362_SDRAM_REGS_BASE (0xdeadbeef) +#define BCM_6362_DSL_BASE (0xdeadbeef) +#define BCM_6362_UBUS_BASE (0xdeadbeef) +#define BCM_6362_ENET0_BASE (0xdeadbeef) +#define BCM_6362_ENET1_BASE (0xdeadbeef) +#define BCM_6362_ENETDMA_BASE (0xb000d800) +#define BCM_6362_ENETDMAC_BASE (0xb000da00) +#define BCM_6362_ENETDMAS_BASE (0xb000dc00) +#define BCM_6362_ENETSW_BASE (0xb0e00000) +#define BCM_6362_EHCI0_BASE (0xb0002500) +#define BCM_6362_SDRAM_BASE (0xdeadbeef) +#define BCM_6362_MEMC_BASE (0xdeadbeef) +#define BCM_6362_DDR_BASE (0xb0003000) +#define BCM_6362_M2M_BASE (0xdeadbeef) +#define BCM_6362_ATM_BASE (0xdeadbeef) +#define BCM_6362_XTM_BASE (0xb0007800) +#define BCM_6362_XTMDMA_BASE (0xb000b800) +#define BCM_6362_XTMDMAC_BASE (0xdeadbeef) +#define BCM_6362_XTMDMAS_BASE (0xdeadbeef) +#define BCM_6362_PCM_BASE (0xb000a800) +#define BCM_6362_PCMDMA_BASE (0xdeadbeef) +#define BCM_6362_PCMDMAC_BASE (0xdeadbeef) +#define BCM_6362_PCMDMAS_BASE (0xdeadbeef) +#define BCM_6362_RNG_BASE (0xdeadbeef) +#define BCM_6362_MISC_BASE (0xb0001800) + +#define BCM_6362_NAND_REG_BASE (0xb0000200) +#define BCM_6362_NAND_CACHE_BASE (0xb0000600) +#define BCM_6362_LED_BASE (0xb0001900) +#define BCM_6362_IPSEC_BASE (0xb0002800) +#define BCM_6362_IPSEC_DMA_BASE (0xb000d000) +#define BCM_6362_WLAN_CHIPCOMMON_BASE (0xb0004000) +#define BCM_6362_WLAN_D11_BASE (0xb0005000) +#define BCM_6362_WLAN_SHIM_BASE (0xb0007000) + /* * 6368 register sets base address */ @@ -564,6 +635,9 @@ static inline unsigned long bcm63xx_regset_address(enum bcm63xx_regs_set set) #ifdef CONFIG_BCM63XX_CPU_6358 __GEN_RSET(6358) #endif +#ifdef CONFIG_BCM63XX_CPU_6362 + __GEN_RSET(6362) +#endif #ifdef CONFIG_BCM63XX_CPU_6368 __GEN_RSET(6368) #endif @@ -819,6 +893,71 @@ enum bcm63xx_irq { #define BCM_6358_EXT_IRQ2 (IRQ_INTERNAL_BASE + 27) #define BCM_6358_EXT_IRQ3 (IRQ_INTERNAL_BASE + 28) +/* + * 6362 irqs + */ +#define BCM_6362_HIGH_IRQ_BASE (IRQ_INTERNAL_BASE + 32) + +#define BCM_6362_TIMER_IRQ (IRQ_INTERNAL_BASE + 0) +#define BCM_6362_SPI_IRQ (IRQ_INTERNAL_BASE + 2) +#define BCM_6362_UART0_IRQ (IRQ_INTERNAL_BASE + 3) +#define BCM_6362_UART1_IRQ (IRQ_INTERNAL_BASE + 4) +#define BCM_6362_DSL_IRQ (IRQ_INTERNAL_BASE + 28) +#define BCM_6362_UDC0_IRQ 0 +#define BCM_6362_ENET0_IRQ 0 +#define BCM_6362_ENET1_IRQ 0 +#define BCM_6362_ENET_PHY_IRQ (IRQ_INTERNAL_BASE + 14) +#define BCM_6362_HSSPI_IRQ (IRQ_INTERNAL_BASE + 5) +#define BCM_6362_OHCI0_IRQ (IRQ_INTERNAL_BASE + 9) +#define BCM_6362_EHCI0_IRQ (IRQ_INTERNAL_BASE + 10) +#define BCM_6362_USBD_IRQ (IRQ_INTERNAL_BASE + 11) +#define BCM_6362_USBD_RXDMA0_IRQ (IRQ_INTERNAL_BASE + 20) +#define BCM_6362_USBD_TXDMA0_IRQ (IRQ_INTERNAL_BASE + 21) +#define BCM_6362_USBD_RXDMA1_IRQ (IRQ_INTERNAL_BASE + 22) +#define BCM_6362_USBD_TXDMA1_IRQ (IRQ_INTERNAL_BASE + 23) +#define BCM_6362_USBD_RXDMA2_IRQ (IRQ_INTERNAL_BASE + 24) +#define BCM_6362_USBD_TXDMA2_IRQ (IRQ_INTERNAL_BASE + 25) +#define BCM_6362_PCMCIA_IRQ 0 +#define BCM_6362_ENET0_RXDMA_IRQ 0 +#define BCM_6362_ENET0_TXDMA_IRQ 0 +#define BCM_6362_ENET1_RXDMA_IRQ 0 +#define BCM_6362_ENET1_TXDMA_IRQ 0 +#define BCM_6362_PCI_IRQ (IRQ_INTERNAL_BASE + 30) +#define BCM_6362_ATM_IRQ 0 +#define BCM_6362_ENETSW_RXDMA0_IRQ (BCM_6362_HIGH_IRQ_BASE + 0) +#define BCM_6362_ENETSW_RXDMA1_IRQ (BCM_6362_HIGH_IRQ_BASE + 1) +#define BCM_6362_ENETSW_RXDMA2_IRQ (BCM_6362_HIGH_IRQ_BASE + 2) +#define BCM_6362_ENETSW_RXDMA3_IRQ (BCM_6362_HIGH_IRQ_BASE + 3) +#define BCM_6362_ENETSW_TXDMA0_IRQ 0 +#define BCM_6362_ENETSW_TXDMA1_IRQ 0 +#define BCM_6362_ENETSW_TXDMA2_IRQ 0 +#define BCM_6362_ENETSW_TXDMA3_IRQ 0 +#define BCM_6362_XTM_IRQ 0 +#define BCM_6362_XTM_DMA0_IRQ (BCM_6362_HIGH_IRQ_BASE + 12) + +#define BCM_6362_RING_OSC_IRQ (IRQ_INTERNAL_BASE + 1) +#define BCM_6362_WLAN_GPIO_IRQ (IRQ_INTERNAL_BASE + 6) +#define BCM_6362_WLAN_IRQ (IRQ_INTERNAL_BASE + 7) +#define BCM_6362_IPSEC_IRQ (IRQ_INTERNAL_BASE + 8) +#define BCM_6362_NAND_IRQ (IRQ_INTERNAL_BASE + 12) +#define BCM_6362_PCM_IRQ (IRQ_INTERNAL_BASE + 13) +#define BCM_6362_DG_IRQ (IRQ_INTERNAL_BASE + 15) +#define BCM_6362_EPHY_ENERGY0_IRQ (IRQ_INTERNAL_BASE + 16) +#define BCM_6362_EPHY_ENERGY1_IRQ (IRQ_INTERNAL_BASE + 17) +#define BCM_6362_EPHY_ENERGY2_IRQ (IRQ_INTERNAL_BASE + 18) +#define BCM_6362_EPHY_ENERGY3_IRQ (IRQ_INTERNAL_BASE + 19) +#define BCM_6362_IPSEC_DMA0_IRQ (IRQ_INTERNAL_BASE + 26) +#define BCM_6362_IPSEC_DMA1_IRQ (IRQ_INTERNAL_BASE + 27) +#define BCM_6362_FAP0_IRQ (IRQ_INTERNAL_BASE + 29) +#define BCM_6362_PCM_DMA0_IRQ (BCM_6362_HIGH_IRQ_BASE + 4) +#define BCM_6362_PCM_DMA1_IRQ (BCM_6362_HIGH_IRQ_BASE + 5) +#define BCM_6362_DECT0_IRQ (BCM_6362_HIGH_IRQ_BASE + 6) +#define BCM_6362_DECT1_IRQ (BCM_6362_HIGH_IRQ_BASE + 7) +#define BCM_6362_EXT_IRQ0 (BCM_6362_HIGH_IRQ_BASE + 8) +#define BCM_6362_EXT_IRQ1 (BCM_6362_HIGH_IRQ_BASE + 9) +#define BCM_6362_EXT_IRQ2 (BCM_6362_HIGH_IRQ_BASE + 10) +#define BCM_6362_EXT_IRQ3 (BCM_6362_HIGH_IRQ_BASE + 11) + /* * 6368 irqs */ diff --git a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_dev_spi.h b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_dev_spi.h index b0184cf02575..c426cabc620a 100644 --- a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_dev_spi.h +++ b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_dev_spi.h @@ -71,18 +71,13 @@ static inline unsigned long bcm63xx_spireg(enum bcm63xx_regs_spi reg) return bcm63xx_regs_spi[reg]; #else -#ifdef CONFIG_BCM63XX_CPU_6338 - __GEN_SPI_RSET(6338) -#endif -#ifdef CONFIG_BCM63XX_CPU_6348 +#if defined(CONFIG_BCM63XX_CPU_6338) || defined(CONFIG_BCM63XX_CPU_6348) __GEN_SPI_RSET(6348) #endif -#ifdef CONFIG_BCM63XX_CPU_6358 +#if defined(CONFIG_BCM63XX_CPU_6358) || defined(CONFIG_BCM63XX_CPU_6362) || \ + defined(CONFIG_BCM63XX_CPU_6368) __GEN_SPI_RSET(6358) #endif -#ifdef CONFIG_BCM63XX_CPU_6368 - __GEN_SPI_RSET(6368) -#endif #endif return 0; } diff --git a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_gpio.h b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_gpio.h index 0a9891f7580d..35baa1a60a64 100644 --- a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_gpio.h +++ b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_gpio.h @@ -17,6 +17,8 @@ static inline unsigned long bcm63xx_gpio_count(void) return 8; case BCM6345_CPU_ID: return 16; + case BCM6362_CPU_ID: + return 48; case BCM6368_CPU_ID: return 38; case BCM6348_CPU_ID: diff --git a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_regs.h b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_regs.h index 81b4702f792a..3203fe49b34d 100644 --- a/arch/mips/include/asm/mach-bcm63xx/bcm63xx_regs.h +++ b/arch/mips/include/asm/mach-bcm63xx/bcm63xx_regs.h @@ -10,7 +10,7 @@ #define REV_CHIPID_SHIFT 16 #define REV_CHIPID_MASK (0xffff << REV_CHIPID_SHIFT) #define REV_REVID_SHIFT 0 -#define REV_REVID_MASK (0xffff << REV_REVID_SHIFT) +#define REV_REVID_MASK (0xff << REV_REVID_SHIFT) /* Clock Control register */ #define PERF_CKCTL_REG 0x4 @@ -112,6 +112,39 @@ CKCTL_6358_USBSU_EN | \ CKCTL_6358_EPHY_EN) +#define CKCTL_6362_ADSL_QPROC_EN (1 << 1) +#define CKCTL_6362_ADSL_AFE_EN (1 << 2) +#define CKCTL_6362_ADSL_EN (1 << 3) +#define CKCTL_6362_MIPS_EN (1 << 4) +#define CKCTL_6362_WLAN_OCP_EN (1 << 5) +#define CKCTL_6362_SWPKT_USB_EN (1 << 7) +#define CKCTL_6362_SWPKT_SAR_EN (1 << 8) +#define CKCTL_6362_SAR_EN (1 << 9) +#define CKCTL_6362_ROBOSW_EN (1 << 10) +#define CKCTL_6362_PCM_EN (1 << 11) +#define CKCTL_6362_USBD_EN (1 << 12) +#define CKCTL_6362_USBH_EN (1 << 13) +#define CKCTL_6362_IPSEC_EN (1 << 14) +#define CKCTL_6362_SPI_EN (1 << 15) +#define CKCTL_6362_HSSPI_EN (1 << 16) +#define CKCTL_6362_PCIE_EN (1 << 17) +#define CKCTL_6362_FAP_EN (1 << 18) +#define CKCTL_6362_PHYMIPS_EN (1 << 19) +#define CKCTL_6362_NAND_EN (1 << 20) + +#define CKCTL_6362_ALL_SAFE_EN (CKCTL_6362_PHYMIPS_EN | \ + CKCTL_6362_ADSL_QPROC_EN | \ + CKCTL_6362_ADSL_AFE_EN | \ + CKCTL_6362_ADSL_EN | \ + CKCTL_6362_SAR_EN | \ + CKCTL_6362_PCM_EN | \ + CKCTL_6362_IPSEC_EN | \ + CKCTL_6362_USBD_EN | \ + CKCTL_6362_USBH_EN | \ + CKCTL_6362_ROBOSW_EN | \ + CKCTL_6362_PCIE_EN) + + #define CKCTL_6368_VDSL_QPROC_EN (1 << 2) #define CKCTL_6368_VDSL_AFE_EN (1 << 3) #define CKCTL_6368_VDSL_BONDING_EN (1 << 4) @@ -153,6 +186,7 @@ #define PERF_IRQMASK_6345_REG 0xc #define PERF_IRQMASK_6348_REG 0xc #define PERF_IRQMASK_6358_REG 0xc +#define PERF_IRQMASK_6362_REG 0x20 #define PERF_IRQMASK_6368_REG 0x20 /* Interrupt Status register */ @@ -161,6 +195,7 @@ #define PERF_IRQSTAT_6345_REG 0x10 #define PERF_IRQSTAT_6348_REG 0x10 #define PERF_IRQSTAT_6358_REG 0x10 +#define PERF_IRQSTAT_6362_REG 0x28 #define PERF_IRQSTAT_6368_REG 0x28 /* External Interrupt Configuration register */ @@ -169,6 +204,7 @@ #define PERF_EXTIRQ_CFG_REG_6345 0x14 #define PERF_EXTIRQ_CFG_REG_6348 0x14 #define PERF_EXTIRQ_CFG_REG_6358 0x14 +#define PERF_EXTIRQ_CFG_REG_6362 0x18 #define PERF_EXTIRQ_CFG_REG_6368 0x18 #define PERF_EXTIRQ_CFG_REG2_6368 0x1c @@ -197,6 +233,7 @@ #define PERF_SOFTRESET_REG 0x28 #define PERF_SOFTRESET_6328_REG 0x10 #define PERF_SOFTRESET_6358_REG 0x34 +#define PERF_SOFTRESET_6362_REG 0x10 #define PERF_SOFTRESET_6368_REG 0x10 #define SOFTRESET_6328_SPI_MASK (1 << 0) @@ -259,6 +296,22 @@ #define SOFTRESET_6358_PCM_MASK (1 << 13) #define SOFTRESET_6358_ADSL_MASK (1 << 14) +#define SOFTRESET_6362_SPI_MASK (1 << 0) +#define SOFTRESET_6362_IPSEC_MASK (1 << 1) +#define SOFTRESET_6362_EPHY_MASK (1 << 2) +#define SOFTRESET_6362_SAR_MASK (1 << 3) +#define SOFTRESET_6362_ENETSW_MASK (1 << 4) +#define SOFTRESET_6362_USBS_MASK (1 << 5) +#define SOFTRESET_6362_USBH_MASK (1 << 6) +#define SOFTRESET_6362_PCM_MASK (1 << 7) +#define SOFTRESET_6362_PCIE_CORE_MASK (1 << 8) +#define SOFTRESET_6362_PCIE_MASK (1 << 9) +#define SOFTRESET_6362_PCIE_EXT_MASK (1 << 10) +#define SOFTRESET_6362_WLAN_SHIM_MASK (1 << 11) +#define SOFTRESET_6362_DDR_PHY_MASK (1 << 12) +#define SOFTRESET_6362_FAP_MASK (1 << 13) +#define SOFTRESET_6362_WLAN_UBUS_MASK (1 << 14) + #define SOFTRESET_6368_SPI_MASK (1 << 0) #define SOFTRESET_6368_MPI_MASK (1 << 3) #define SOFTRESET_6368_EPHY_MASK (1 << 6) @@ -1223,24 +1276,7 @@ * _REG relative to RSET_SPI *************************************************************************/ -/* BCM 6338 SPI core */ -#define SPI_6338_CMD 0x00 /* 16-bits register */ -#define SPI_6338_INT_STATUS 0x02 -#define SPI_6338_INT_MASK_ST 0x03 -#define SPI_6338_INT_MASK 0x04 -#define SPI_6338_ST 0x05 -#define SPI_6338_CLK_CFG 0x06 -#define SPI_6338_FILL_BYTE 0x07 -#define SPI_6338_MSG_TAIL 0x09 -#define SPI_6338_RX_TAIL 0x0b -#define SPI_6338_MSG_CTL 0x40 /* 8-bits register */ -#define SPI_6338_MSG_CTL_WIDTH 8 -#define SPI_6338_MSG_DATA 0x41 -#define SPI_6338_MSG_DATA_SIZE 0x3f -#define SPI_6338_RX_DATA 0x80 -#define SPI_6338_RX_DATA_SIZE 0x3f - -/* BCM 6348 SPI core */ +/* BCM 6338/6348 SPI core */ #define SPI_6348_CMD 0x00 /* 16-bits register */ #define SPI_6348_INT_STATUS 0x02 #define SPI_6348_INT_MASK_ST 0x03 @@ -1257,7 +1293,7 @@ #define SPI_6348_RX_DATA 0x80 #define SPI_6348_RX_DATA_SIZE 0x3f -/* BCM 6358 SPI core */ +/* BCM 6358/6262/6368 SPI core */ #define SPI_6358_MSG_CTL 0x00 /* 16-bits register */ #define SPI_6358_MSG_CTL_WIDTH 16 #define SPI_6358_MSG_DATA 0x02 @@ -1274,23 +1310,6 @@ #define SPI_6358_MSG_TAIL 0x709 #define SPI_6358_RX_TAIL 0x70B -/* BCM 6358 SPI core */ -#define SPI_6368_MSG_CTL 0x00 /* 16-bits register */ -#define SPI_6368_MSG_CTL_WIDTH 16 -#define SPI_6368_MSG_DATA 0x02 -#define SPI_6368_MSG_DATA_SIZE 0x21e -#define SPI_6368_RX_DATA 0x400 -#define SPI_6368_RX_DATA_SIZE 0x220 -#define SPI_6368_CMD 0x700 /* 16-bits register */ -#define SPI_6368_INT_STATUS 0x702 -#define SPI_6368_INT_MASK_ST 0x703 -#define SPI_6368_INT_MASK 0x704 -#define SPI_6368_ST 0x705 -#define SPI_6368_CLK_CFG 0x706 -#define SPI_6368_FILL_BYTE 0x707 -#define SPI_6368_MSG_TAIL 0x709 -#define SPI_6368_RX_TAIL 0x70B - /* Shared SPI definitions */ /* Message configuration */ @@ -1298,10 +1317,8 @@ #define SPI_HD_W 0x01 #define SPI_HD_R 0x02 #define SPI_BYTE_CNT_SHIFT 0 -#define SPI_6338_MSG_TYPE_SHIFT 6 #define SPI_6348_MSG_TYPE_SHIFT 6 #define SPI_6358_MSG_TYPE_SHIFT 14 -#define SPI_6368_MSG_TYPE_SHIFT 14 /* Command */ #define SPI_CMD_NOOP 0x00 @@ -1348,10 +1365,18 @@ /************************************************************************* * _REG relative to RSET_MISC *************************************************************************/ -#define MISC_SERDES_CTRL_REG 0x0 +#define MISC_SERDES_CTRL_6328_REG 0x0 +#define MISC_SERDES_CTRL_6362_REG 0x4 #define SERDES_PCIE_EN (1 << 0) #define SERDES_PCIE_EXD_EN (1 << 15) +#define MISC_STRAPBUS_6362_REG 0x14 +#define STRAPBUS_6362_FCVO_SHIFT 1 +#define STRAPBUS_6362_HSSPI_CLK_FAST (1 << 13) +#define STRAPBUS_6362_FCVO_MASK (0x1f << STRAPBUS_6362_FCVO_SHIFT) +#define STRAPBUS_6362_BOOT_SEL_SERIAL (1 << 15) +#define STRAPBUS_6362_BOOT_SEL_NAND (0 << 15) + #define MISC_STRAPBUS_6328_REG 0x240 #define STRAPBUS_6328_FCVO_SHIFT 7 #define STRAPBUS_6328_FCVO_MASK (0x1f << STRAPBUS_6328_FCVO_SHIFT) diff --git a/arch/mips/include/asm/mach-bcm63xx/ioremap.h b/arch/mips/include/asm/mach-bcm63xx/ioremap.h index 30931c42379d..94e3011ba7df 100644 --- a/arch/mips/include/asm/mach-bcm63xx/ioremap.h +++ b/arch/mips/include/asm/mach-bcm63xx/ioremap.h @@ -19,6 +19,7 @@ static inline int is_bcm63xx_internal_registers(phys_t offset) return 1; break; case BCM6328_CPU_ID: + case BCM6362_CPU_ID: case BCM6368_CPU_ID: if (offset >= 0xb0000000 && offset < 0xb1000000) return 1; diff --git a/arch/mips/include/asm/mach-generic/dma-coherence.h b/arch/mips/include/asm/mach-generic/dma-coherence.h index 9c95177f7a7e..fe23034aaf72 100644 --- a/arch/mips/include/asm/mach-generic/dma-coherence.h +++ b/arch/mips/include/asm/mach-generic/dma-coherence.h @@ -61,9 +61,8 @@ static inline int plat_device_is_coherent(struct device *dev) { #ifdef CONFIG_DMA_COHERENT return 1; -#endif -#ifdef CONFIG_DMA_NONCOHERENT - return 0; +#else + return coherentio; #endif } diff --git a/arch/mips/include/asm/mach-generic/spaces.h b/arch/mips/include/asm/mach-generic/spaces.h index 73d717a75cb0..5b2f2e68e57f 100644 --- a/arch/mips/include/asm/mach-generic/spaces.h +++ b/arch/mips/include/asm/mach-generic/spaces.h @@ -20,14 +20,21 @@ #endif #ifdef CONFIG_32BIT - +#ifdef CONFIG_KVM_GUEST +#define CAC_BASE _AC(0x40000000, UL) +#else #define CAC_BASE _AC(0x80000000, UL) +#endif #define IO_BASE _AC(0xa0000000, UL) #define UNCAC_BASE _AC(0xa0000000, UL) #ifndef MAP_BASE +#ifdef CONFIG_KVM_GUEST +#define MAP_BASE _AC(0x60000000, UL) +#else #define MAP_BASE _AC(0xc0000000, UL) #endif +#endif /* * Memory above this physical address will be considered highmem. diff --git a/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h b/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h index 75fd8c0f986e..c0f3ef45c2c1 100644 --- a/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h +++ b/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h @@ -57,5 +57,6 @@ #define cpu_has_vint 0 #define cpu_has_vtag_icache 0 #define cpu_has_watch 1 +#define cpu_has_local_ebase 0 #endif /* __ASM_MACH_LOONGSON_CPU_FEATURE_OVERRIDES_H */ diff --git a/arch/mips/include/asm/mach-ralink/mt7620.h b/arch/mips/include/asm/mach-ralink/mt7620.h new file mode 100644 index 000000000000..9809972ea882 --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/mt7620.h @@ -0,0 +1,84 @@ +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * Parts of this file are based on Ralink's 2.6.21 BSP + * + * Copyright (C) 2008-2011 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * Copyright (C) 2013 John Crispin + */ + +#ifndef _MT7620_REGS_H_ +#define _MT7620_REGS_H_ + +#define MT7620_SYSC_BASE 0x10000000 + +#define SYSC_REG_CHIP_NAME0 0x00 +#define SYSC_REG_CHIP_NAME1 0x04 +#define SYSC_REG_CHIP_REV 0x0c +#define SYSC_REG_SYSTEM_CONFIG0 0x10 +#define SYSC_REG_SYSTEM_CONFIG1 0x14 +#define SYSC_REG_CPLL_CONFIG0 0x54 +#define SYSC_REG_CPLL_CONFIG1 0x58 + +#define MT7620N_CHIP_NAME0 0x33365452 +#define MT7620N_CHIP_NAME1 0x20203235 + +#define MT7620A_CHIP_NAME0 0x3637544d +#define MT7620A_CHIP_NAME1 0x20203032 + +#define CHIP_REV_PKG_MASK 0x1 +#define CHIP_REV_PKG_SHIFT 16 +#define CHIP_REV_VER_MASK 0xf +#define CHIP_REV_VER_SHIFT 8 +#define CHIP_REV_ECO_MASK 0xf + +#define CPLL_SW_CONFIG_SHIFT 31 +#define CPLL_SW_CONFIG_MASK 0x1 +#define CPLL_CPU_CLK_SHIFT 24 +#define CPLL_CPU_CLK_MASK 0x1 +#define CPLL_MULT_RATIO_SHIFT 16 +#define CPLL_MULT_RATIO 0x7 +#define CPLL_DIV_RATIO_SHIFT 10 +#define CPLL_DIV_RATIO 0x3 + +#define SYSCFG0_DRAM_TYPE_MASK 0x3 +#define SYSCFG0_DRAM_TYPE_SHIFT 4 +#define SYSCFG0_DRAM_TYPE_SDRAM 0 +#define SYSCFG0_DRAM_TYPE_DDR1 1 +#define SYSCFG0_DRAM_TYPE_DDR2 2 + +#define MT7620_DRAM_BASE 0x0 +#define MT7620_SDRAM_SIZE_MIN 2 +#define MT7620_SDRAM_SIZE_MAX 64 +#define MT7620_DDR1_SIZE_MIN 32 +#define MT7620_DDR1_SIZE_MAX 128 +#define MT7620_DDR2_SIZE_MIN 32 +#define MT7620_DDR2_SIZE_MAX 256 + +#define MT7620_GPIO_MODE_I2C BIT(0) +#define MT7620_GPIO_MODE_UART0_SHIFT 2 +#define MT7620_GPIO_MODE_UART0_MASK 0x7 +#define MT7620_GPIO_MODE_UART0(x) ((x) << MT7620_GPIO_MODE_UART0_SHIFT) +#define MT7620_GPIO_MODE_UARTF 0x0 +#define MT7620_GPIO_MODE_PCM_UARTF 0x1 +#define MT7620_GPIO_MODE_PCM_I2S 0x2 +#define MT7620_GPIO_MODE_I2S_UARTF 0x3 +#define MT7620_GPIO_MODE_PCM_GPIO 0x4 +#define MT7620_GPIO_MODE_GPIO_UARTF 0x5 +#define MT7620_GPIO_MODE_GPIO_I2S 0x6 +#define MT7620_GPIO_MODE_GPIO 0x7 +#define MT7620_GPIO_MODE_UART1 BIT(5) +#define MT7620_GPIO_MODE_MDIO BIT(8) +#define MT7620_GPIO_MODE_RGMII1 BIT(9) +#define MT7620_GPIO_MODE_RGMII2 BIT(10) +#define MT7620_GPIO_MODE_SPI BIT(11) +#define MT7620_GPIO_MODE_SPI_REF_CLK BIT(12) +#define MT7620_GPIO_MODE_WLED BIT(13) +#define MT7620_GPIO_MODE_JTAG BIT(15) +#define MT7620_GPIO_MODE_EPHY BIT(15) +#define MT7620_GPIO_MODE_WDT BIT(22) + +#endif diff --git a/arch/mips/include/asm/mach-ralink/rt288x.h b/arch/mips/include/asm/mach-ralink/rt288x.h new file mode 100644 index 000000000000..03ad716acb42 --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/rt288x.h @@ -0,0 +1,53 @@ +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * Parts of this file are based on Ralink's 2.6.21 BSP + * + * Copyright (C) 2008-2011 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * Copyright (C) 2013 John Crispin + */ + +#ifndef _RT288X_REGS_H_ +#define _RT288X_REGS_H_ + +#define RT2880_SYSC_BASE 0x00300000 + +#define SYSC_REG_CHIP_NAME0 0x00 +#define SYSC_REG_CHIP_NAME1 0x04 +#define SYSC_REG_CHIP_ID 0x0c +#define SYSC_REG_SYSTEM_CONFIG 0x10 +#define SYSC_REG_CLKCFG 0x30 + +#define RT2880_CHIP_NAME0 0x38325452 +#define RT2880_CHIP_NAME1 0x20203038 + +#define CHIP_ID_ID_MASK 0xff +#define CHIP_ID_ID_SHIFT 8 +#define CHIP_ID_REV_MASK 0xff + +#define SYSTEM_CONFIG_CPUCLK_SHIFT 20 +#define SYSTEM_CONFIG_CPUCLK_MASK 0x3 +#define SYSTEM_CONFIG_CPUCLK_250 0x0 +#define SYSTEM_CONFIG_CPUCLK_266 0x1 +#define SYSTEM_CONFIG_CPUCLK_280 0x2 +#define SYSTEM_CONFIG_CPUCLK_300 0x3 + +#define RT2880_GPIO_MODE_I2C BIT(0) +#define RT2880_GPIO_MODE_UART0 BIT(1) +#define RT2880_GPIO_MODE_SPI BIT(2) +#define RT2880_GPIO_MODE_UART1 BIT(3) +#define RT2880_GPIO_MODE_JTAG BIT(4) +#define RT2880_GPIO_MODE_MDIO BIT(5) +#define RT2880_GPIO_MODE_SDRAM BIT(6) +#define RT2880_GPIO_MODE_PCI BIT(7) + +#define CLKCFG_SRAM_CS_N_WDT BIT(9) + +#define RT2880_SDRAM_BASE 0x08000000 +#define RT2880_MEM_SIZE_MIN 2 +#define RT2880_MEM_SIZE_MAX 128 + +#endif diff --git a/arch/mips/include/asm/mach-ralink/rt288x/cpu-feature-overrides.h b/arch/mips/include/asm/mach-ralink/rt288x/cpu-feature-overrides.h new file mode 100644 index 000000000000..72fc10669199 --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/rt288x/cpu-feature-overrides.h @@ -0,0 +1,56 @@ +/* + * Ralink RT288x specific CPU feature overrides + * + * Copyright (C) 2008-2009 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * + * This file was derived from: include/asm-mips/cpu-features.h + * Copyright (C) 2003, 2004 Ralf Baechle + * Copyright (C) 2004 Maciej W. Rozycki + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + */ +#ifndef _RT288X_CPU_FEATURE_OVERRIDES_H +#define _RT288X_CPU_FEATURE_OVERRIDES_H + +#define cpu_has_tlb 1 +#define cpu_has_4kex 1 +#define cpu_has_3k_cache 0 +#define cpu_has_4k_cache 1 +#define cpu_has_tx39_cache 0 +#define cpu_has_sb1_cache 0 +#define cpu_has_fpu 0 +#define cpu_has_32fpr 0 +#define cpu_has_counter 1 +#define cpu_has_watch 1 +#define cpu_has_divec 1 + +#define cpu_has_prefetch 1 +#define cpu_has_ejtag 1 +#define cpu_has_llsc 1 + +#define cpu_has_mips16 1 +#define cpu_has_mdmx 0 +#define cpu_has_mips3d 0 +#define cpu_has_smartmips 0 + +#define cpu_has_mips32r1 1 +#define cpu_has_mips32r2 1 +#define cpu_has_mips64r1 0 +#define cpu_has_mips64r2 0 + +#define cpu_has_dsp 0 +#define cpu_has_mipsmt 0 + +#define cpu_has_64bits 0 +#define cpu_has_64bit_zero_reg 0 +#define cpu_has_64bit_gp_regs 0 +#define cpu_has_64bit_addresses 0 + +#define cpu_dcache_line_size() 16 +#define cpu_icache_line_size() 16 + +#endif /* _RT288X_CPU_FEATURE_OVERRIDES_H */ diff --git a/arch/mips/include/asm/mach-ralink/rt305x.h b/arch/mips/include/asm/mach-ralink/rt305x.h index 7d344f2d7d0a..069bf37a6010 100644 --- a/arch/mips/include/asm/mach-ralink/rt305x.h +++ b/arch/mips/include/asm/mach-ralink/rt305x.h @@ -97,6 +97,14 @@ static inline int soc_is_rt5350(void) #define RT5350_SYSCFG0_CPUCLK_320 0x2 #define RT5350_SYSCFG0_CPUCLK_300 0x3 +#define RT5350_SYSCFG0_DRAM_SIZE_SHIFT 12 +#define RT5350_SYSCFG0_DRAM_SIZE_MASK 7 +#define RT5350_SYSCFG0_DRAM_SIZE_2M 0 +#define RT5350_SYSCFG0_DRAM_SIZE_8M 1 +#define RT5350_SYSCFG0_DRAM_SIZE_16M 2 +#define RT5350_SYSCFG0_DRAM_SIZE_32M 3 +#define RT5350_SYSCFG0_DRAM_SIZE_64M 4 + /* multi function gpio pins */ #define RT305X_GPIO_I2C_SD 1 #define RT305X_GPIO_I2C_SCLK 2 @@ -136,4 +144,23 @@ static inline int soc_is_rt5350(void) #define RT305X_GPIO_MODE_SDRAM BIT(8) #define RT305X_GPIO_MODE_RGMII BIT(9) +#define RT3352_SYSC_REG_SYSCFG0 0x010 +#define RT3352_SYSC_REG_SYSCFG1 0x014 +#define RT3352_SYSC_REG_CLKCFG1 0x030 +#define RT3352_SYSC_REG_RSTCTRL 0x034 +#define RT3352_SYSC_REG_USB_PS 0x05c + +#define RT3352_CLKCFG0_XTAL_SEL BIT(20) +#define RT3352_CLKCFG1_UPHY0_CLK_EN BIT(18) +#define RT3352_CLKCFG1_UPHY1_CLK_EN BIT(20) +#define RT3352_RSTCTRL_UHST BIT(22) +#define RT3352_RSTCTRL_UDEV BIT(25) +#define RT3352_SYSCFG1_USB0_HOST_MODE BIT(10) + +#define RT305X_SDRAM_BASE 0x00000000 +#define RT305X_MEM_SIZE_MIN 2 +#define RT305X_MEM_SIZE_MAX 64 +#define RT3352_MEM_SIZE_MIN 2 +#define RT3352_MEM_SIZE_MAX 256 + #endif diff --git a/arch/mips/include/asm/mach-ralink/rt305x/cpu-feature-overrides.h b/arch/mips/include/asm/mach-ralink/rt305x/cpu-feature-overrides.h new file mode 100644 index 000000000000..917c28654552 --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/rt305x/cpu-feature-overrides.h @@ -0,0 +1,56 @@ +/* + * Ralink RT305x specific CPU feature overrides + * + * Copyright (C) 2008-2009 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * + * This file was derived from: include/asm-mips/cpu-features.h + * Copyright (C) 2003, 2004 Ralf Baechle + * Copyright (C) 2004 Maciej W. Rozycki + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + */ +#ifndef _RT305X_CPU_FEATURE_OVERRIDES_H +#define _RT305X_CPU_FEATURE_OVERRIDES_H + +#define cpu_has_tlb 1 +#define cpu_has_4kex 1 +#define cpu_has_3k_cache 0 +#define cpu_has_4k_cache 1 +#define cpu_has_tx39_cache 0 +#define cpu_has_sb1_cache 0 +#define cpu_has_fpu 0 +#define cpu_has_32fpr 0 +#define cpu_has_counter 1 +#define cpu_has_watch 1 +#define cpu_has_divec 1 + +#define cpu_has_prefetch 1 +#define cpu_has_ejtag 1 +#define cpu_has_llsc 1 + +#define cpu_has_mips16 1 +#define cpu_has_mdmx 0 +#define cpu_has_mips3d 0 +#define cpu_has_smartmips 0 + +#define cpu_has_mips32r1 1 +#define cpu_has_mips32r2 1 +#define cpu_has_mips64r1 0 +#define cpu_has_mips64r2 0 + +#define cpu_has_dsp 1 +#define cpu_has_mipsmt 0 + +#define cpu_has_64bits 0 +#define cpu_has_64bit_zero_reg 0 +#define cpu_has_64bit_gp_regs 0 +#define cpu_has_64bit_addresses 0 + +#define cpu_dcache_line_size() 32 +#define cpu_icache_line_size() 32 + +#endif /* _RT305X_CPU_FEATURE_OVERRIDES_H */ diff --git a/arch/mips/include/asm/mach-ralink/rt3883.h b/arch/mips/include/asm/mach-ralink/rt3883.h new file mode 100644 index 000000000000..058382f37f92 --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/rt3883.h @@ -0,0 +1,252 @@ +/* + * Ralink RT3662/RT3883 SoC register definitions + * + * Copyright (C) 2011-2012 Gabor Juhos + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + */ + +#ifndef _RT3883_REGS_H_ +#define _RT3883_REGS_H_ + +#include + +#define RT3883_SDRAM_BASE 0x00000000 +#define RT3883_SYSC_BASE 0x10000000 +#define RT3883_TIMER_BASE 0x10000100 +#define RT3883_INTC_BASE 0x10000200 +#define RT3883_MEMC_BASE 0x10000300 +#define RT3883_UART0_BASE 0x10000500 +#define RT3883_PIO_BASE 0x10000600 +#define RT3883_FSCC_BASE 0x10000700 +#define RT3883_NANDC_BASE 0x10000810 +#define RT3883_I2C_BASE 0x10000900 +#define RT3883_I2S_BASE 0x10000a00 +#define RT3883_SPI_BASE 0x10000b00 +#define RT3883_UART1_BASE 0x10000c00 +#define RT3883_PCM_BASE 0x10002000 +#define RT3883_GDMA_BASE 0x10002800 +#define RT3883_CODEC1_BASE 0x10003000 +#define RT3883_CODEC2_BASE 0x10003800 +#define RT3883_FE_BASE 0x10100000 +#define RT3883_ROM_BASE 0x10118000 +#define RT3883_USBDEV_BASE 0x10112000 +#define RT3883_PCI_BASE 0x10140000 +#define RT3883_WLAN_BASE 0x10180000 +#define RT3883_USBHOST_BASE 0x101c0000 +#define RT3883_BOOT_BASE 0x1c000000 +#define RT3883_SRAM_BASE 0x1e000000 +#define RT3883_PCIMEM_BASE 0x20000000 + +#define RT3883_EHCI_BASE (RT3883_USBHOST_BASE) +#define RT3883_OHCI_BASE (RT3883_USBHOST_BASE + 0x1000) + +#define RT3883_SYSC_SIZE 0x100 +#define RT3883_TIMER_SIZE 0x100 +#define RT3883_INTC_SIZE 0x100 +#define RT3883_MEMC_SIZE 0x100 +#define RT3883_UART0_SIZE 0x100 +#define RT3883_UART1_SIZE 0x100 +#define RT3883_PIO_SIZE 0x100 +#define RT3883_FSCC_SIZE 0x100 +#define RT3883_NANDC_SIZE 0x0f0 +#define RT3883_I2C_SIZE 0x100 +#define RT3883_I2S_SIZE 0x100 +#define RT3883_SPI_SIZE 0x100 +#define RT3883_PCM_SIZE 0x800 +#define RT3883_GDMA_SIZE 0x800 +#define RT3883_CODEC1_SIZE 0x800 +#define RT3883_CODEC2_SIZE 0x800 +#define RT3883_FE_SIZE 0x10000 +#define RT3883_ROM_SIZE 0x4000 +#define RT3883_USBDEV_SIZE 0x4000 +#define RT3883_PCI_SIZE 0x40000 +#define RT3883_WLAN_SIZE 0x40000 +#define RT3883_USBHOST_SIZE 0x40000 +#define RT3883_BOOT_SIZE (32 * 1024 * 1024) +#define RT3883_SRAM_SIZE (32 * 1024 * 1024) + +/* SYSC registers */ +#define RT3883_SYSC_REG_CHIPID0_3 0x00 /* Chip ID 0 */ +#define RT3883_SYSC_REG_CHIPID4_7 0x04 /* Chip ID 1 */ +#define RT3883_SYSC_REG_REVID 0x0c /* Chip Revision Identification */ +#define RT3883_SYSC_REG_SYSCFG0 0x10 /* System Configuration 0 */ +#define RT3883_SYSC_REG_SYSCFG1 0x14 /* System Configuration 1 */ +#define RT3883_SYSC_REG_CLKCFG0 0x2c /* Clock Configuration 0 */ +#define RT3883_SYSC_REG_CLKCFG1 0x30 /* Clock Configuration 1 */ +#define RT3883_SYSC_REG_RSTCTRL 0x34 /* Reset Control*/ +#define RT3883_SYSC_REG_RSTSTAT 0x38 /* Reset Status*/ +#define RT3883_SYSC_REG_USB_PS 0x5c /* USB Power saving control */ +#define RT3883_SYSC_REG_GPIO_MODE 0x60 /* GPIO Purpose Select */ +#define RT3883_SYSC_REG_PCIE_CLK_GEN0 0x7c +#define RT3883_SYSC_REG_PCIE_CLK_GEN1 0x80 +#define RT3883_SYSC_REG_PCIE_CLK_GEN2 0x84 +#define RT3883_SYSC_REG_PMU 0x88 +#define RT3883_SYSC_REG_PMU1 0x8c + +#define RT3883_CHIP_NAME0 0x38335452 +#define RT3883_CHIP_NAME1 0x20203338 + +#define RT3883_REVID_VER_ID_MASK 0x0f +#define RT3883_REVID_VER_ID_SHIFT 8 +#define RT3883_REVID_ECO_ID_MASK 0x0f + +#define RT3883_SYSCFG0_DRAM_TYPE_DDR2 BIT(17) +#define RT3883_SYSCFG0_CPUCLK_SHIFT 8 +#define RT3883_SYSCFG0_CPUCLK_MASK 0x3 +#define RT3883_SYSCFG0_CPUCLK_250 0x0 +#define RT3883_SYSCFG0_CPUCLK_384 0x1 +#define RT3883_SYSCFG0_CPUCLK_480 0x2 +#define RT3883_SYSCFG0_CPUCLK_500 0x3 + +#define RT3883_SYSCFG1_USB0_HOST_MODE BIT(10) +#define RT3883_SYSCFG1_PCIE_RC_MODE BIT(8) +#define RT3883_SYSCFG1_PCI_HOST_MODE BIT(7) +#define RT3883_SYSCFG1_PCI_66M_MODE BIT(6) +#define RT3883_SYSCFG1_GPIO2_AS_WDT_OUT BIT(2) + +#define RT3883_CLKCFG1_PCIE_CLK_EN BIT(21) +#define RT3883_CLKCFG1_UPHY1_CLK_EN BIT(20) +#define RT3883_CLKCFG1_PCI_CLK_EN BIT(19) +#define RT3883_CLKCFG1_UPHY0_CLK_EN BIT(18) + +#define RT3883_GPIO_MODE_I2C BIT(0) +#define RT3883_GPIO_MODE_SPI BIT(1) +#define RT3883_GPIO_MODE_UART0_SHIFT 2 +#define RT3883_GPIO_MODE_UART0_MASK 0x7 +#define RT3883_GPIO_MODE_UART0(x) ((x) << RT3883_GPIO_MODE_UART0_SHIFT) +#define RT3883_GPIO_MODE_UARTF 0x0 +#define RT3883_GPIO_MODE_PCM_UARTF 0x1 +#define RT3883_GPIO_MODE_PCM_I2S 0x2 +#define RT3883_GPIO_MODE_I2S_UARTF 0x3 +#define RT3883_GPIO_MODE_PCM_GPIO 0x4 +#define RT3883_GPIO_MODE_GPIO_UARTF 0x5 +#define RT3883_GPIO_MODE_GPIO_I2S 0x6 +#define RT3883_GPIO_MODE_GPIO 0x7 +#define RT3883_GPIO_MODE_UART1 BIT(5) +#define RT3883_GPIO_MODE_JTAG BIT(6) +#define RT3883_GPIO_MODE_MDIO BIT(7) +#define RT3883_GPIO_MODE_GE1 BIT(9) +#define RT3883_GPIO_MODE_GE2 BIT(10) +#define RT3883_GPIO_MODE_PCI_SHIFT 11 +#define RT3883_GPIO_MODE_PCI_MASK 0x7 +#define RT3883_GPIO_MODE_PCI (RT3883_GPIO_MODE_PCI_MASK << RT3883_GPIO_MODE_PCI_SHIFT) +#define RT3883_GPIO_MODE_LNA_A_SHIFT 16 +#define RT3883_GPIO_MODE_LNA_A_MASK 0x3 +#define _RT3883_GPIO_MODE_LNA_A(_x) ((_x) << RT3883_GPIO_MODE_LNA_A_SHIFT) +#define RT3883_GPIO_MODE_LNA_A_GPIO 0x3 +#define RT3883_GPIO_MODE_LNA_A _RT3883_GPIO_MODE_LNA_A(RT3883_GPIO_MODE_LNA_A_MASK) +#define RT3883_GPIO_MODE_LNA_G_SHIFT 18 +#define RT3883_GPIO_MODE_LNA_G_MASK 0x3 +#define _RT3883_GPIO_MODE_LNA_G(_x) ((_x) << RT3883_GPIO_MODE_LNA_G_SHIFT) +#define RT3883_GPIO_MODE_LNA_G_GPIO 0x3 +#define RT3883_GPIO_MODE_LNA_G _RT3883_GPIO_MODE_LNA_G(RT3883_GPIO_MODE_LNA_G_MASK) + +#define RT3883_GPIO_I2C_SD 1 +#define RT3883_GPIO_I2C_SCLK 2 +#define RT3883_GPIO_SPI_CS0 3 +#define RT3883_GPIO_SPI_CLK 4 +#define RT3883_GPIO_SPI_MOSI 5 +#define RT3883_GPIO_SPI_MISO 6 +#define RT3883_GPIO_7 7 +#define RT3883_GPIO_10 10 +#define RT3883_GPIO_11 11 +#define RT3883_GPIO_14 14 +#define RT3883_GPIO_UART1_TXD 15 +#define RT3883_GPIO_UART1_RXD 16 +#define RT3883_GPIO_JTAG_TDO 17 +#define RT3883_GPIO_JTAG_TDI 18 +#define RT3883_GPIO_JTAG_TMS 19 +#define RT3883_GPIO_JTAG_TCLK 20 +#define RT3883_GPIO_JTAG_TRST_N 21 +#define RT3883_GPIO_MDIO_MDC 22 +#define RT3883_GPIO_MDIO_MDIO 23 +#define RT3883_GPIO_LNA_PE_A0 32 +#define RT3883_GPIO_LNA_PE_A1 33 +#define RT3883_GPIO_LNA_PE_A2 34 +#define RT3883_GPIO_LNA_PE_G0 35 +#define RT3883_GPIO_LNA_PE_G1 36 +#define RT3883_GPIO_LNA_PE_G2 37 +#define RT3883_GPIO_PCI_AD0 40 +#define RT3883_GPIO_PCI_AD31 71 +#define RT3883_GPIO_GE2_TXD0 72 +#define RT3883_GPIO_GE2_TXD1 73 +#define RT3883_GPIO_GE2_TXD2 74 +#define RT3883_GPIO_GE2_TXD3 75 +#define RT3883_GPIO_GE2_TXEN 76 +#define RT3883_GPIO_GE2_TXCLK 77 +#define RT3883_GPIO_GE2_RXD0 78 +#define RT3883_GPIO_GE2_RXD1 79 +#define RT3883_GPIO_GE2_RXD2 80 +#define RT3883_GPIO_GE2_RXD3 81 +#define RT3883_GPIO_GE2_RXDV 82 +#define RT3883_GPIO_GE2_RXCLK 83 +#define RT3883_GPIO_GE1_TXD0 84 +#define RT3883_GPIO_GE1_TXD1 85 +#define RT3883_GPIO_GE1_TXD2 86 +#define RT3883_GPIO_GE1_TXD3 87 +#define RT3883_GPIO_GE1_TXEN 88 +#define RT3883_GPIO_GE1_TXCLK 89 +#define RT3883_GPIO_GE1_RXD0 90 +#define RT3883_GPIO_GE1_RXD1 91 +#define RT3883_GPIO_GE1_RXD2 92 +#define RT3883_GPIO_GE1_RXD3 93 +#define RT3883_GPIO_GE1_RXDV 94 +#define RT3883_GPIO_GE1_RXCLK 95 + +#define RT3883_RSTCTRL_PCIE_PCI_PDM BIT(27) +#define RT3883_RSTCTRL_FLASH BIT(26) +#define RT3883_RSTCTRL_UDEV BIT(25) +#define RT3883_RSTCTRL_PCI BIT(24) +#define RT3883_RSTCTRL_PCIE BIT(23) +#define RT3883_RSTCTRL_UHST BIT(22) +#define RT3883_RSTCTRL_FE BIT(21) +#define RT3883_RSTCTRL_WLAN BIT(20) +#define RT3883_RSTCTRL_UART1 BIT(29) +#define RT3883_RSTCTRL_SPI BIT(18) +#define RT3883_RSTCTRL_I2S BIT(17) +#define RT3883_RSTCTRL_I2C BIT(16) +#define RT3883_RSTCTRL_NAND BIT(15) +#define RT3883_RSTCTRL_DMA BIT(14) +#define RT3883_RSTCTRL_PIO BIT(13) +#define RT3883_RSTCTRL_UART BIT(12) +#define RT3883_RSTCTRL_PCM BIT(11) +#define RT3883_RSTCTRL_MC BIT(10) +#define RT3883_RSTCTRL_INTC BIT(9) +#define RT3883_RSTCTRL_TIMER BIT(8) +#define RT3883_RSTCTRL_SYS BIT(0) + +#define RT3883_INTC_INT_SYSCTL BIT(0) +#define RT3883_INTC_INT_TIMER0 BIT(1) +#define RT3883_INTC_INT_TIMER1 BIT(2) +#define RT3883_INTC_INT_IA BIT(3) +#define RT3883_INTC_INT_PCM BIT(4) +#define RT3883_INTC_INT_UART0 BIT(5) +#define RT3883_INTC_INT_PIO BIT(6) +#define RT3883_INTC_INT_DMA BIT(7) +#define RT3883_INTC_INT_NAND BIT(8) +#define RT3883_INTC_INT_PERFC BIT(9) +#define RT3883_INTC_INT_I2S BIT(10) +#define RT3883_INTC_INT_UART1 BIT(12) +#define RT3883_INTC_INT_UHST BIT(18) +#define RT3883_INTC_INT_UDEV BIT(19) + +/* FLASH/SRAM/Codec Controller registers */ +#define RT3883_FSCC_REG_FLASH_CFG0 0x00 +#define RT3883_FSCC_REG_FLASH_CFG1 0x04 +#define RT3883_FSCC_REG_CODEC_CFG0 0x40 +#define RT3883_FSCC_REG_CODEC_CFG1 0x44 + +#define RT3883_FLASH_CFG_WIDTH_SHIFT 26 +#define RT3883_FLASH_CFG_WIDTH_MASK 0x3 +#define RT3883_FLASH_CFG_WIDTH_8BIT 0x0 +#define RT3883_FLASH_CFG_WIDTH_16BIT 0x1 +#define RT3883_FLASH_CFG_WIDTH_32BIT 0x2 + +#define RT3883_SDRAM_BASE 0x00000000 +#define RT3883_MEM_SIZE_MIN 2 +#define RT3883_MEM_SIZE_MAX 256 + +#endif /* _RT3883_REGS_H_ */ diff --git a/arch/mips/include/asm/mach-ralink/rt3883/cpu-feature-overrides.h b/arch/mips/include/asm/mach-ralink/rt3883/cpu-feature-overrides.h new file mode 100644 index 000000000000..181fbf4c976f --- /dev/null +++ b/arch/mips/include/asm/mach-ralink/rt3883/cpu-feature-overrides.h @@ -0,0 +1,55 @@ +/* + * Ralink RT3662/RT3883 specific CPU feature overrides + * + * Copyright (C) 2011-2013 Gabor Juhos + * + * This file was derived from: include/asm-mips/cpu-features.h + * Copyright (C) 2003, 2004 Ralf Baechle + * Copyright (C) 2004 Maciej W. Rozycki + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + */ +#ifndef _RT3883_CPU_FEATURE_OVERRIDES_H +#define _RT3883_CPU_FEATURE_OVERRIDES_H + +#define cpu_has_tlb 1 +#define cpu_has_4kex 1 +#define cpu_has_3k_cache 0 +#define cpu_has_4k_cache 1 +#define cpu_has_tx39_cache 0 +#define cpu_has_sb1_cache 0 +#define cpu_has_fpu 0 +#define cpu_has_32fpr 0 +#define cpu_has_counter 1 +#define cpu_has_watch 1 +#define cpu_has_divec 1 + +#define cpu_has_prefetch 1 +#define cpu_has_ejtag 1 +#define cpu_has_llsc 1 + +#define cpu_has_mips16 1 +#define cpu_has_mdmx 0 +#define cpu_has_mips3d 0 +#define cpu_has_smartmips 0 + +#define cpu_has_mips32r1 1 +#define cpu_has_mips32r2 1 +#define cpu_has_mips64r1 0 +#define cpu_has_mips64r2 0 + +#define cpu_has_dsp 1 +#define cpu_has_mipsmt 0 + +#define cpu_has_64bits 0 +#define cpu_has_64bit_zero_reg 0 +#define cpu_has_64bit_gp_regs 0 +#define cpu_has_64bit_addresses 0 + +#define cpu_dcache_line_size() 32 +#define cpu_icache_line_size() 32 + +#endif /* _RT3883_CPU_FEATURE_OVERRIDES_H */ diff --git a/arch/mips/include/asm/mach-sead3/cpu-feature-overrides.h b/arch/mips/include/asm/mach-sead3/cpu-feature-overrides.h index 193c0912d38e..bfbd7035d4c5 100644 --- a/arch/mips/include/asm/mach-sead3/cpu-feature-overrides.h +++ b/arch/mips/include/asm/mach-sead3/cpu-feature-overrides.h @@ -28,7 +28,11 @@ /* #define cpu_has_prefetch ? */ #define cpu_has_mcheck 1 /* #define cpu_has_ejtag ? */ +#ifdef CONFIG_CPU_MICROMIPS +#define cpu_has_llsc 0 +#else #define cpu_has_llsc 1 +#endif /* #define cpu_has_vtag_icache ? */ /* #define cpu_has_dc_aliases ? */ /* #define cpu_has_ic_fills_f_dc ? */ diff --git a/arch/mips/include/asm/mips-boards/generic.h b/arch/mips/include/asm/mips-boards/generic.h index 44a09a64160a..bd9746fbe4af 100644 --- a/arch/mips/include/asm/mips-boards/generic.h +++ b/arch/mips/include/asm/mips-boards/generic.h @@ -83,4 +83,7 @@ extern void mips_pcibios_init(void); #define mips_pcibios_init() do { } while (0) #endif +extern void mips_scroll_message(void); +extern void mips_display_message(const char *str); + #endif /* __ASM_MIPS_BOARDS_GENERIC_H */ diff --git a/arch/mips/include/asm/mips-boards/prom.h b/arch/mips/include/asm/mips-boards/prom.h deleted file mode 100644 index e7aed3e4ff58..000000000000 --- a/arch/mips/include/asm/mips-boards/prom.h +++ /dev/null @@ -1,47 +0,0 @@ -/* - * Carsten Langgaard, carstenl@mips.com - * Copyright (C) 2000 MIPS Technologies, Inc. All rights reserved. - * - * ######################################################################## - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. - * - * ######################################################################## - * - * MIPS boards bootprom interface for the Linux kernel. - * - */ - -#ifndef _MIPS_PROM_H -#define _MIPS_PROM_H - -extern char *prom_getcmdline(void); -extern char *prom_getenv(char *name); -extern void prom_init_cmdline(void); -extern void prom_meminit(void); -extern void prom_fixup_mem_map(unsigned long start_mem, unsigned long end_mem); -extern void mips_display_message(const char *str); -extern void mips_display_word(unsigned int num); -extern void mips_scroll_message(void); -extern int get_ethernet_addr(char *ethernet_addr); - -/* Memory descriptor management. */ -#define PROM_MAX_PMEMBLOCKS 32 -struct prom_pmemblock { - unsigned long base; /* Within KSEG0. */ - unsigned int size; /* In bytes. */ - unsigned int type; /* free or prom memory */ -}; - -#endif /* !(_MIPS_PROM_H) */ diff --git a/arch/mips/include/asm/mips_machine.h b/arch/mips/include/asm/mips_machine.h index 363bb352c7f7..9d00aebe9842 100644 --- a/arch/mips/include/asm/mips_machine.h +++ b/arch/mips/include/asm/mips_machine.h @@ -42,13 +42,9 @@ extern long __mips_machines_end; #ifdef CONFIG_MIPS_MACHINE int mips_machtype_setup(char *id) __init; void mips_machine_setup(void) __init; -void mips_set_machine_name(const char *name) __init; -char *mips_get_machine_name(void); #else static inline int mips_machtype_setup(char *id) { return 1; } static inline void mips_machine_setup(void) { } -static inline void mips_set_machine_name(const char *name) { } -static inline char *mips_get_machine_name(void) { return NULL; } #endif /* CONFIG_MIPS_MACHINE */ #endif /* __ASM_MIPS_MACHINE_H */ diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h index 0da44d422f5b..87e6207b05e4 100644 --- a/arch/mips/include/asm/mipsregs.h +++ b/arch/mips/include/asm/mipsregs.h @@ -596,6 +596,7 @@ #define MIPS_CONF3_RXI (_ULCAST_(1) << 12) #define MIPS_CONF3_ULRI (_ULCAST_(1) << 13) #define MIPS_CONF3_ISA (_ULCAST_(3) << 14) +#define MIPS_CONF3_ISA_OE (_ULCAST_(3) << 16) #define MIPS_CONF3_VZ (_ULCAST_(1) << 23) #define MIPS_CONF4_MMUSIZEEXT (_ULCAST_(255) << 0) @@ -622,6 +623,24 @@ #ifndef __ASSEMBLY__ +/* + * Macros for handling the ISA mode bit for microMIPS. + */ +#define get_isa16_mode(x) ((x) & 0x1) +#define msk_isa16_mode(x) ((x) & ~0x1) +#define set_isa16_mode(x) do { (x) |= 0x1; } while(0) + +/* + * microMIPS instructions can be 16-bit or 32-bit in length. This + * returns a 1 if the instruction is 16-bit and a 0 if 32-bit. + */ +static inline int mm_insn_16bit(u16 insn) +{ + u16 opcode = (insn >> 10) & 0x7; + + return (opcode >= 1 && opcode <= 3) ? 1 : 0; +} + /* * Functions to access the R10000 performance counters. These are basically * mfc0 and mtc0 instructions from and to coprocessor register with a 5-bit diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h index e81d719efcd1..1554721e4808 100644 --- a/arch/mips/include/asm/mmu_context.h +++ b/arch/mips/include/asm/mmu_context.h @@ -26,10 +26,15 @@ #ifdef CONFIG_MIPS_PGD_C0_CONTEXT -#define TLBMISS_HANDLER_SETUP_PGD(pgd) \ - tlbmiss_handler_setup_pgd((unsigned long)(pgd)) - -extern void tlbmiss_handler_setup_pgd(unsigned long pgd); +#define TLBMISS_HANDLER_SETUP_PGD(pgd) \ +do { \ + void (*tlbmiss_handler_setup_pgd)(unsigned long); \ + extern u32 tlbmiss_handler_setup_pgd_array[16]; \ + \ + tlbmiss_handler_setup_pgd = \ + (__typeof__(tlbmiss_handler_setup_pgd)) tlbmiss_handler_setup_pgd_array; \ + tlbmiss_handler_setup_pgd((unsigned long)(pgd)); \ +} while (0) #define TLBMISS_HANDLER_SETUP() \ do { \ @@ -62,59 +67,88 @@ extern unsigned long pgd_current[]; TLBMISS_HANDLER_SETUP_PGD(swapper_pg_dir) #endif #endif /* CONFIG_MIPS_PGD_C0_CONTEXT*/ -#if defined(CONFIG_CPU_R3000) || defined(CONFIG_CPU_TX39XX) - -#define ASID_INC 0x40 -#define ASID_MASK 0xfc0 - -#elif defined(CONFIG_CPU_R8000) - -#define ASID_INC 0x10 -#define ASID_MASK 0xff0 -#elif defined(CONFIG_MIPS_MT_SMTC) - -#define ASID_INC 0x1 -extern unsigned long smtc_asid_mask; -#define ASID_MASK (smtc_asid_mask) -#define HW_ASID_MASK 0xff -/* End SMTC/34K debug hack */ -#else /* FIXME: not correct for R6000 */ - -#define ASID_INC 0x1 -#define ASID_MASK 0xff +#define ASID_INC(asid) \ +({ \ + unsigned long __asid = asid; \ + __asm__("1:\taddiu\t%0,1\t\t\t\t# patched\n\t" \ + ".section\t__asid_inc,\"a\"\n\t" \ + ".word\t1b\n\t" \ + ".previous" \ + :"=r" (__asid) \ + :"0" (__asid)); \ + __asid; \ +}) +#define ASID_MASK(asid) \ +({ \ + unsigned long __asid = asid; \ + __asm__("1:\tandi\t%0,%1,0xfc0\t\t\t# patched\n\t" \ + ".section\t__asid_mask,\"a\"\n\t" \ + ".word\t1b\n\t" \ + ".previous" \ + :"=r" (__asid) \ + :"r" (__asid)); \ + __asid; \ +}) +#define ASID_VERSION_MASK \ +({ \ + unsigned long __asid; \ + __asm__("1:\taddiu\t%0,$0,0xff00\t\t\t\t# patched\n\t" \ + ".section\t__asid_version_mask,\"a\"\n\t" \ + ".word\t1b\n\t" \ + ".previous" \ + :"=r" (__asid)); \ + __asid; \ +}) +#define ASID_FIRST_VERSION \ +({ \ + unsigned long __asid = asid; \ + __asm__("1:\tli\t%0,0x100\t\t\t\t# patched\n\t" \ + ".section\t__asid_first_version,\"a\"\n\t" \ + ".word\t1b\n\t" \ + ".previous" \ + :"=r" (__asid)); \ + __asid; \ +}) + +#define ASID_FIRST_VERSION_R3000 0x1000 +#define ASID_FIRST_VERSION_R4000 0x100 +#define ASID_FIRST_VERSION_R8000 0x1000 +#define ASID_FIRST_VERSION_RM9000 0x1000 +#ifdef CONFIG_MIPS_MT_SMTC +#define SMTC_HW_ASID_MASK 0xff +extern unsigned int smtc_asid_mask; #endif #define cpu_context(cpu, mm) ((mm)->context.asid[cpu]) -#define cpu_asid(cpu, mm) (cpu_context((cpu), (mm)) & ASID_MASK) +#define cpu_asid(cpu, mm) ASID_MASK(cpu_context((cpu), (mm))) #define asid_cache(cpu) (cpu_data[cpu].asid_cache) static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) { } -/* - * All unused by hardware upper bits will be considered - * as a software asid extension. - */ -#define ASID_VERSION_MASK ((unsigned long)~(ASID_MASK|(ASID_MASK-1))) -#define ASID_FIRST_VERSION ((unsigned long)(~ASID_VERSION_MASK) + 1) - #ifndef CONFIG_MIPS_MT_SMTC /* Normal, classic MIPS get_new_mmu_context */ static inline void get_new_mmu_context(struct mm_struct *mm, unsigned long cpu) { + extern void kvm_local_flush_tlb_all(void); unsigned long asid = asid_cache(cpu); - if (! ((asid += ASID_INC) & ASID_MASK) ) { + if (!ASID_MASK((asid = ASID_INC(asid)))) { if (cpu_has_vtag_icache) flush_icache_all(); +#ifdef CONFIG_VIRTUALIZATION + kvm_local_flush_tlb_all(); /* start new asid cycle */ +#else local_flush_tlb_all(); /* start new asid cycle */ +#endif if (!asid) /* fix version if needed */ asid = ASID_FIRST_VERSION; } + cpu_context(cpu, mm) = asid_cache(cpu) = asid; } @@ -133,7 +167,7 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm) { int i; - for_each_online_cpu(i) + for_each_possible_cpu(i) cpu_context(i, mm) = 0; return 0; @@ -166,7 +200,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, * free up the ASID value for use and flush any old * instances of it from the TLB. */ - oldasid = (read_c0_entryhi() & ASID_MASK); + oldasid = ASID_MASK(read_c0_entryhi()); if(smtc_live_asid[mytlb][oldasid]) { smtc_live_asid[mytlb][oldasid] &= ~(0x1 << cpu); if(smtc_live_asid[mytlb][oldasid] == 0) @@ -177,7 +211,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, * having ASID_MASK smaller than the hardware maximum, * make sure no "soft" bits become "hard"... */ - write_c0_entryhi((read_c0_entryhi() & ~HW_ASID_MASK) | + write_c0_entryhi((read_c0_entryhi() & ~SMTC_HW_ASID_MASK) | cpu_asid(cpu, next)); ehb(); /* Make sure it propagates to TCStatus */ evpe(mtflags); @@ -230,15 +264,15 @@ activate_mm(struct mm_struct *prev, struct mm_struct *next) #ifdef CONFIG_MIPS_MT_SMTC /* See comments for similar code above */ mtflags = dvpe(); - oldasid = read_c0_entryhi() & ASID_MASK; + oldasid = ASID_MASK(read_c0_entryhi()); if(smtc_live_asid[mytlb][oldasid]) { smtc_live_asid[mytlb][oldasid] &= ~(0x1 << cpu); if(smtc_live_asid[mytlb][oldasid] == 0) smtc_flush_tlb_asid(oldasid); } /* See comments for similar code above */ - write_c0_entryhi((read_c0_entryhi() & ~HW_ASID_MASK) | - cpu_asid(cpu, next)); + write_c0_entryhi((read_c0_entryhi() & ~SMTC_HW_ASID_MASK) | + cpu_asid(cpu, next)); ehb(); /* Make sure it propagates to TCStatus */ evpe(mtflags); #else @@ -275,14 +309,14 @@ drop_mmu_context(struct mm_struct *mm, unsigned cpu) #ifdef CONFIG_MIPS_MT_SMTC /* See comments for similar code above */ prevvpe = dvpe(); - oldasid = (read_c0_entryhi() & ASID_MASK); + oldasid = ASID_MASK(read_c0_entryhi()); if (smtc_live_asid[mytlb][oldasid]) { smtc_live_asid[mytlb][oldasid] &= ~(0x1 << cpu); if(smtc_live_asid[mytlb][oldasid] == 0) smtc_flush_tlb_asid(oldasid); } /* See comments for similar code above */ - write_c0_entryhi((read_c0_entryhi() & ~HW_ASID_MASK) + write_c0_entryhi((read_c0_entryhi() & ~SMTC_HW_ASID_MASK) | cpu_asid(cpu, mm)); ehb(); /* Make sure it propagates to TCStatus */ evpe(prevvpe); diff --git a/arch/mips/include/asm/netlogic/haldefs.h b/arch/mips/include/asm/netlogic/haldefs.h index 419d8aef8569..79c7cccdc22c 100644 --- a/arch/mips/include/asm/netlogic/haldefs.h +++ b/arch/mips/include/asm/netlogic/haldefs.h @@ -35,42 +35,13 @@ #ifndef __NLM_HAL_HALDEFS_H__ #define __NLM_HAL_HALDEFS_H__ +#include /* for local_irq_disable */ + /* * This file contains platform specific memory mapped IO implementation * and will provide a way to read 32/64 bit memory mapped registers in * all ABIs */ -#if !defined(CONFIG_64BIT) && defined(CONFIG_CPU_XLP) -#error "o32 compile not supported on XLP yet" -#endif -/* - * For o32 compilation, we have to disable interrupts and enable KX bit to - * access 64 bit addresses or data. - * - * We need to disable interrupts because we save just the lower 32 bits of - * registers in interrupt handling. So if we get hit by an interrupt while - * using the upper 32 bits of a register, we lose. - */ -static inline uint32_t nlm_save_flags_kx(void) -{ - return change_c0_status(ST0_KX | ST0_IE, ST0_KX); -} - -static inline uint32_t nlm_save_flags_cop2(void) -{ - return change_c0_status(ST0_CU2 | ST0_IE, ST0_CU2); -} - -static inline void nlm_restore_flags(uint32_t sr) -{ - write_c0_status(sr); -} - -/* - * The n64 implementations are simple, the o32 implementations when they - * are added, will have to disable interrupts and enable KX before doing - * 64 bit ops. - */ static inline uint32_t nlm_read_reg(uint64_t base, uint32_t reg) { @@ -87,13 +58,40 @@ nlm_write_reg(uint64_t base, uint32_t reg, uint32_t val) *addr = val; } +/* + * For o32 compilation, we have to disable interrupts to access 64 bit + * registers + * + * We need to disable interrupts because we save just the lower 32 bits of + * registers in interrupt handling. So if we get hit by an interrupt while + * using the upper 32 bits of a register, we lose. + */ + static inline uint64_t nlm_read_reg64(uint64_t base, uint32_t reg) { uint64_t addr = base + (reg >> 1) * sizeof(uint64_t); volatile uint64_t *ptr = (volatile uint64_t *)(long)addr; - - return *ptr; + uint64_t val; + + if (sizeof(unsigned long) == 4) { + unsigned long flags; + + local_irq_save(flags); + __asm__ __volatile__( + ".set push" "\n\t" + ".set mips64" "\n\t" + "ld %L0, %1" "\n\t" + "dsra32 %M0, %L0, 0" "\n\t" + "sll %L0, %L0, 0" "\n\t" + ".set pop" "\n" + : "=r" (val) + : "m" (*ptr)); + local_irq_restore(flags); + } else + val = *ptr; + + return val; } static inline void @@ -102,7 +100,25 @@ nlm_write_reg64(uint64_t base, uint32_t reg, uint64_t val) uint64_t addr = base + (reg >> 1) * sizeof(uint64_t); volatile uint64_t *ptr = (volatile uint64_t *)(long)addr; - *ptr = val; + if (sizeof(unsigned long) == 4) { + unsigned long flags; + uint64_t tmp; + + local_irq_save(flags); + __asm__ __volatile__( + ".set push" "\n\t" + ".set mips64" "\n\t" + "dsll32 %L0, %L0, 0" "\n\t" + "dsrl32 %L0, %L0, 0" "\n\t" + "dsll32 %M0, %M0, 0" "\n\t" + "or %L0, %L0, %M0" "\n\t" + "sd %L0, %2" "\n\t" + ".set pop" "\n" + : "=r" (tmp) + : "0" (val), "m" (*ptr)); + local_irq_restore(flags); + } else + *ptr = val; } /* @@ -143,14 +159,6 @@ nlm_pcicfg_base(uint32_t devoffset) return nlm_io_base + devoffset; } -static inline uint64_t -nlm_xkphys_map_pcibar0(uint64_t pcibase) -{ - uint64_t paddr; - - paddr = nlm_read_reg(pcibase, 0x4) & ~0xfu; - return (uint64_t)0x9000000000000000 | paddr; -} #elif defined(CONFIG_CPU_XLR) static inline uint64_t diff --git a/arch/mips/include/asm/netlogic/mips-extns.h b/arch/mips/include/asm/netlogic/mips-extns.h index 8ad2e0f81719..f299d31d7c1a 100644 --- a/arch/mips/include/asm/netlogic/mips-extns.h +++ b/arch/mips/include/asm/netlogic/mips-extns.h @@ -38,21 +38,16 @@ /* * XLR and XLP interrupt request and interrupt mask registers */ -#define read_c0_eirr() __read_64bit_c0_register($9, 6) -#define read_c0_eimr() __read_64bit_c0_register($9, 7) -#define write_c0_eirr(val) __write_64bit_c0_register($9, 6, val) - /* - * Writing EIMR in 32 bit is a special case, the lower 8 bit of the - * EIMR is shadowed in the status register, so we cannot save and - * restore status register for split read. + * NOTE: Do not save/restore flags around write_c0_eimr(). + * On non-R2 platforms the flags has part of EIMR that is shadowed in STATUS + * register. Restoring flags will overwrite the lower 8 bits of EIMR. + * + * Call with interrupts disabled. */ #define write_c0_eimr(val) \ do { \ if (sizeof(unsigned long) == 4) { \ - unsigned long __flags; \ - \ - local_irq_save(__flags); \ __asm__ __volatile__( \ ".set\tmips64\n\t" \ "dsll\t%L0, %L0, 32\n\t" \ @@ -62,8 +57,6 @@ do { \ "dmtc0\t%L0, $9, 7\n\t" \ ".set\tmips0" \ : : "r" (val)); \ - __flags = (__flags & 0xffff00ff) | (((val) & 0xff) << 8);\ - local_irq_restore(__flags); \ } else \ __write_64bit_c0_register($9, 7, (val)); \ } while (0) @@ -128,7 +121,7 @@ static inline uint64_t read_c0_eirr_and_eimr(void) uint64_t val; #ifdef CONFIG_64BIT - val = read_c0_eimr() & read_c0_eirr(); + val = __read_64bit_c0_register($9, 6) & __read_64bit_c0_register($9, 7); #else __asm__ __volatile__( ".set push\n\t" @@ -143,7 +136,6 @@ static inline uint64_t read_c0_eirr_and_eimr(void) ".set pop" : "=r" (val)); #endif - return val; } diff --git a/arch/mips/include/asm/netlogic/xlp-hal/pic.h b/arch/mips/include/asm/netlogic/xlp-hal/pic.h index 3df53017fe51..a981f4681a15 100644 --- a/arch/mips/include/asm/netlogic/xlp-hal/pic.h +++ b/arch/mips/include/asm/netlogic/xlp-hal/pic.h @@ -191,59 +191,6 @@ #define PIC_IRT_PCIE_LINK_2_INDEX 80 #define PIC_IRT_PCIE_LINK_3_INDEX 81 #define PIC_IRT_PCIE_LINK_INDEX(num) ((num) + PIC_IRT_PCIE_LINK_0_INDEX) -/* 78 to 81 */ -#define PIC_NUM_NA_IRTS 32 -/* 82 to 113 */ -#define PIC_IRT_NA_0_INDEX 82 -#define PIC_IRT_NA_INDEX(num) ((num) + PIC_IRT_NA_0_INDEX) -#define PIC_IRT_POE_INDEX 114 - -#define PIC_NUM_USB_IRTS 6 -#define PIC_IRT_USB_0_INDEX 115 -#define PIC_IRT_EHCI_0_INDEX 115 -#define PIC_IRT_OHCI_0_INDEX 116 -#define PIC_IRT_OHCI_1_INDEX 117 -#define PIC_IRT_EHCI_1_INDEX 118 -#define PIC_IRT_OHCI_2_INDEX 119 -#define PIC_IRT_OHCI_3_INDEX 120 -#define PIC_IRT_USB_INDEX(num) ((num) + PIC_IRT_USB_0_INDEX) -/* 115 to 120 */ -#define PIC_IRT_GDX_INDEX 121 -#define PIC_IRT_SEC_INDEX 122 -#define PIC_IRT_RSA_INDEX 123 - -#define PIC_NUM_COMP_IRTS 4 -#define PIC_IRT_COMP_0_INDEX 124 -#define PIC_IRT_COMP_INDEX(num) ((num) + PIC_IRT_COMP_0_INDEX) -/* 124 to 127 */ -#define PIC_IRT_GBU_INDEX 128 -#define PIC_IRT_ICC_0_INDEX 129 /* ICC - Inter Chip Coherency */ -#define PIC_IRT_ICC_1_INDEX 130 -#define PIC_IRT_ICC_2_INDEX 131 -#define PIC_IRT_CAM_INDEX 132 -#define PIC_IRT_UART_0_INDEX 133 -#define PIC_IRT_UART_1_INDEX 134 -#define PIC_IRT_I2C_0_INDEX 135 -#define PIC_IRT_I2C_1_INDEX 136 -#define PIC_IRT_SYS_0_INDEX 137 -#define PIC_IRT_SYS_1_INDEX 138 -#define PIC_IRT_JTAG_INDEX 139 -#define PIC_IRT_PIC_INDEX 140 -#define PIC_IRT_NBU_INDEX 141 -#define PIC_IRT_TCU_INDEX 142 -#define PIC_IRT_GCU_INDEX 143 /* GBC - Global Coherency */ -#define PIC_IRT_DMC_0_INDEX 144 -#define PIC_IRT_DMC_1_INDEX 145 - -#define PIC_NUM_GPIO_IRTS 4 -#define PIC_IRT_GPIO_0_INDEX 146 -#define PIC_IRT_GPIO_INDEX(num) ((num) + PIC_IRT_GPIO_0_INDEX) - -/* 146 to 149 */ -#define PIC_IRT_NOR_INDEX 150 -#define PIC_IRT_NAND_INDEX 151 -#define PIC_IRT_SPI_INDEX 152 -#define PIC_IRT_MMC_INDEX 153 #define PIC_CLOCK_TIMER 7 #define PIC_IRQ_BASE 8 diff --git a/arch/mips/include/asm/netlogic/xlp-hal/usb.h b/arch/mips/include/asm/netlogic/xlp-hal/usb.h deleted file mode 100644 index a9cd350dfb6c..000000000000 --- a/arch/mips/include/asm/netlogic/xlp-hal/usb.h +++ /dev/null @@ -1,64 +0,0 @@ -/* - * Copyright (c) 2003-2012 Broadcom Corporation - * All Rights Reserved - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the Broadcom - * license below: - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * - * THIS SOFTWARE IS PROVIDED BY BROADCOM ``AS IS'' AND ANY EXPRESS OR - * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED - * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL BROADCOM OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, - * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE - * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN - * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - */ - -#ifndef __NLM_HAL_USB_H__ -#define __NLM_HAL_USB_H__ - -#define USB_CTL_0 0x01 -#define USB_PHY_0 0x0A -#define USB_PHY_RESET 0x01 -#define USB_PHY_PORT_RESET_0 0x10 -#define USB_PHY_PORT_RESET_1 0x20 -#define USB_CONTROLLER_RESET 0x01 -#define USB_INT_STATUS 0x0E -#define USB_INT_EN 0x0F -#define USB_PHY_INTERRUPT_EN 0x01 -#define USB_OHCI_INTERRUPT_EN 0x02 -#define USB_OHCI_INTERRUPT1_EN 0x04 -#define USB_OHCI_INTERRUPT2_EN 0x08 -#define USB_CTRL_INTERRUPT_EN 0x10 - -#ifndef __ASSEMBLY__ - -#define nlm_read_usb_reg(b, r) nlm_read_reg(b, r) -#define nlm_write_usb_reg(b, r, v) nlm_write_reg(b, r, v) -#define nlm_get_usb_pcibase(node, inst) \ - nlm_pcicfg_base(XLP_IO_USB_OFFSET(node, inst)) -#define nlm_get_usb_hcd_base(node, inst) \ - nlm_xkphys_map_pcibar0(nlm_get_usb_pcibase(node, inst)) -#define nlm_get_usb_regbase(node, inst) \ - (nlm_get_usb_pcibase(node, inst) + XLP_IO_PCI_HDRSZ) - -#endif -#endif /* __NLM_HAL_USB_H__ */ diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h index fdc62fb5630d..8b8f6b393363 100644 --- a/arch/mips/include/asm/pgtable.h +++ b/arch/mips/include/asm/pgtable.h @@ -8,6 +8,7 @@ #ifndef _ASM_PGTABLE_H #define _ASM_PGTABLE_H +#include #include #ifdef CONFIG_32BIT #include diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index 2a5fa7abb346..71686c897dea 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -44,11 +44,16 @@ extern unsigned int vced_count, vcei_count; #define SPECIAL_PAGES_SIZE PAGE_SIZE #ifdef CONFIG_32BIT +#ifdef CONFIG_KVM_GUEST +/* User space process size is limited to 1GB in KVM Guest Mode */ +#define TASK_SIZE 0x3fff8000UL +#else /* * User space process size: 2GB. This is hardcoded into a few places, * so don't change it unless you know what you are doing. */ #define TASK_SIZE 0x7fff8000UL +#endif #ifdef __KERNEL__ #define STACK_TOP_MAX TASK_SIZE diff --git a/arch/mips/include/asm/prom.h b/arch/mips/include/asm/prom.h index 8808bf548b99..1e7e0961064b 100644 --- a/arch/mips/include/asm/prom.h +++ b/arch/mips/include/asm/prom.h @@ -48,4 +48,7 @@ extern void __dt_setup_arch(struct boot_param_header *bph); static inline void device_tree_init(void) { } #endif /* CONFIG_OF */ +extern char *mips_get_machine_name(void); +extern void mips_set_machine_name(const char *name); + #endif /* __ASM_PROM_H */ diff --git a/arch/mips/include/asm/sn/sn_private.h b/arch/mips/include/asm/sn/sn_private.h index 1a2c3025bf28..fdfae43d8b99 100644 --- a/arch/mips/include/asm/sn/sn_private.h +++ b/arch/mips/include/asm/sn/sn_private.h @@ -14,6 +14,6 @@ extern void install_cpu_nmi_handler(int slice); extern void install_ipi(void); extern void setup_replication_mask(void); extern void replicate_kernel_text(void); -extern pfn_t node_getfirstfree(cnodeid_t); +extern unsigned long node_getfirstfree(cnodeid_t); #endif /* __ASM_SN_SN_PRIVATE_H */ diff --git a/arch/mips/include/asm/sn/types.h b/arch/mips/include/asm/sn/types.h index c4813d67aec3..6d24d4e8b9ed 100644 --- a/arch/mips/include/asm/sn/types.h +++ b/arch/mips/include/asm/sn/types.h @@ -19,7 +19,6 @@ typedef signed char partid_t; /* partition ID type */ typedef signed short moduleid_t; /* user-visible module number type */ typedef signed short cmoduleid_t; /* kernel compact module id type */ typedef unsigned char clusterid_t; /* Clusterid of the cell */ -typedef unsigned long pfn_t; typedef dev_t vertex_hdl_t; /* hardware graph vertex handle */ diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h index 5130c88d6420..78d201fb6c87 100644 --- a/arch/mips/include/asm/spinlock.h +++ b/arch/mips/include/asm/spinlock.h @@ -71,7 +71,6 @@ static inline void arch_spin_lock(arch_spinlock_t *lock) " nop \n" " srl %[my_ticket], %[ticket], 16 \n" " andi %[ticket], %[ticket], 0xffff \n" - " andi %[my_ticket], %[my_ticket], 0xffff \n" " bne %[ticket], %[my_ticket], 4f \n" " subu %[ticket], %[my_ticket], %[ticket] \n" "2: \n" @@ -105,7 +104,6 @@ static inline void arch_spin_lock(arch_spinlock_t *lock) " beqz %[my_ticket], 1b \n" " srl %[my_ticket], %[ticket], 16 \n" " andi %[ticket], %[ticket], 0xffff \n" - " andi %[my_ticket], %[my_ticket], 0xffff \n" " bne %[ticket], %[my_ticket], 4f \n" " subu %[ticket], %[my_ticket], %[ticket] \n" "2: \n" @@ -153,7 +151,6 @@ static inline unsigned int arch_spin_trylock(arch_spinlock_t *lock) " \n" "1: ll %[ticket], %[ticket_ptr] \n" " srl %[my_ticket], %[ticket], 16 \n" - " andi %[my_ticket], %[my_ticket], 0xffff \n" " andi %[now_serving], %[ticket], 0xffff \n" " bne %[my_ticket], %[now_serving], 3f \n" " addu %[ticket], %[ticket], %[inc] \n" @@ -178,7 +175,6 @@ static inline unsigned int arch_spin_trylock(arch_spinlock_t *lock) " \n" "1: ll %[ticket], %[ticket_ptr] \n" " srl %[my_ticket], %[ticket], 16 \n" - " andi %[my_ticket], %[my_ticket], 0xffff \n" " andi %[now_serving], %[ticket], 0xffff \n" " bne %[my_ticket], %[now_serving], 3f \n" " addu %[ticket], %[ticket], %[inc] \n" @@ -242,25 +238,16 @@ static inline void arch_read_lock(arch_rwlock_t *rw) : "m" (rw->lock) : "memory"); } else { - __asm__ __volatile__( - " .set noreorder # arch_read_lock \n" - "1: ll %1, %2 \n" - " bltz %1, 3f \n" - " addu %1, 1 \n" - "2: sc %1, %0 \n" - " beqz %1, 1b \n" - " nop \n" - " .subsection 2 \n" - "3: ll %1, %2 \n" - " bltz %1, 3b \n" - " addu %1, 1 \n" - " b 2b \n" - " nop \n" - " .previous \n" - " .set reorder \n" - : "=m" (rw->lock), "=&r" (tmp) - : "m" (rw->lock) - : "memory"); + do { + __asm__ __volatile__( + "1: ll %1, %2 # arch_read_lock \n" + " bltz %1, 1b \n" + " addu %1, 1 \n" + "2: sc %1, %0 \n" + : "=m" (rw->lock), "=&r" (tmp) + : "m" (rw->lock) + : "memory"); + } while (unlikely(!tmp)); } smp_llsc_mb(); @@ -285,21 +272,15 @@ static inline void arch_read_unlock(arch_rwlock_t *rw) : "m" (rw->lock) : "memory"); } else { - __asm__ __volatile__( - " .set noreorder # arch_read_unlock \n" - "1: ll %1, %2 \n" - " sub %1, 1 \n" - " sc %1, %0 \n" - " beqz %1, 2f \n" - " nop \n" - " .subsection 2 \n" - "2: b 1b \n" - " nop \n" - " .previous \n" - " .set reorder \n" - : "=m" (rw->lock), "=&r" (tmp) - : "m" (rw->lock) - : "memory"); + do { + __asm__ __volatile__( + "1: ll %1, %2 # arch_read_unlock \n" + " sub %1, 1 \n" + " sc %1, %0 \n" + : "=m" (rw->lock), "=&r" (tmp) + : "m" (rw->lock) + : "memory"); + } while (unlikely(!tmp)); } } @@ -321,25 +302,16 @@ static inline void arch_write_lock(arch_rwlock_t *rw) : "m" (rw->lock) : "memory"); } else { - __asm__ __volatile__( - " .set noreorder # arch_write_lock \n" - "1: ll %1, %2 \n" - " bnez %1, 3f \n" - " lui %1, 0x8000 \n" - "2: sc %1, %0 \n" - " beqz %1, 3f \n" - " nop \n" - " .subsection 2 \n" - "3: ll %1, %2 \n" - " bnez %1, 3b \n" - " lui %1, 0x8000 \n" - " b 2b \n" - " nop \n" - " .previous \n" - " .set reorder \n" - : "=m" (rw->lock), "=&r" (tmp) - : "m" (rw->lock) - : "memory"); + do { + __asm__ __volatile__( + "1: ll %1, %2 # arch_write_lock \n" + " bnez %1, 1b \n" + " lui %1, 0x8000 \n" + "2: sc %1, %0 \n" + : "=m" (rw->lock), "=&r" (tmp) + : "m" (rw->lock) + : "memory"); + } while (unlikely(!tmp)); } smp_llsc_mb(); @@ -424,25 +396,21 @@ static inline int arch_write_trylock(arch_rwlock_t *rw) : "m" (rw->lock) : "memory"); } else { - __asm__ __volatile__( - " .set noreorder # arch_write_trylock \n" - " li %2, 0 \n" - "1: ll %1, %3 \n" - " bnez %1, 2f \n" - " lui %1, 0x8000 \n" - " sc %1, %0 \n" - " beqz %1, 3f \n" - " li %2, 1 \n" - "2: \n" - __WEAK_LLSC_MB - " .subsection 2 \n" - "3: b 1b \n" - " li %2, 0 \n" - " .previous \n" - " .set reorder \n" - : "=m" (rw->lock), "=&r" (tmp), "=&r" (ret) - : "m" (rw->lock) - : "memory"); + do { + __asm__ __volatile__( + " ll %1, %3 # arch_write_trylock \n" + " li %2, 0 \n" + " bnez %1, 2f \n" + " lui %1, 0x8000 \n" + " sc %1, %0 \n" + " li %2, 1 \n" + "2: \n" + : "=m" (rw->lock), "=&r" (tmp), "=&r" (ret) + : "m" (rw->lock) + : "memory"); + } while (unlikely(!tmp)); + + smp_llsc_mb(); } return ret; diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h index c99384018161..a89d1b10d027 100644 --- a/arch/mips/include/asm/stackframe.h +++ b/arch/mips/include/asm/stackframe.h @@ -139,7 +139,7 @@ 1: move ra, k0 li k0, 3 mtc0 k0, $22 -#endif /* CONFIG_CPU_LOONGSON2F */ +#endif /* CONFIG_CPU_JUMP_WORKAROUNDS */ #if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32) lui k1, %hi(kernelsp) #else @@ -189,6 +189,7 @@ LONG_S $0, PT_R0(sp) mfc0 v1, CP0_STATUS LONG_S $2, PT_R2(sp) + LONG_S v1, PT_STATUS(sp) #ifdef CONFIG_MIPS_MT_SMTC /* * Ideally, these instructions would be shuffled in @@ -200,21 +201,20 @@ LONG_S k0, PT_TCSTATUS(sp) #endif /* CONFIG_MIPS_MT_SMTC */ LONG_S $4, PT_R4(sp) - LONG_S $5, PT_R5(sp) - LONG_S v1, PT_STATUS(sp) mfc0 v1, CP0_CAUSE - LONG_S $6, PT_R6(sp) - LONG_S $7, PT_R7(sp) + LONG_S $5, PT_R5(sp) LONG_S v1, PT_CAUSE(sp) + LONG_S $6, PT_R6(sp) MFC0 v1, CP0_EPC + LONG_S $7, PT_R7(sp) #ifdef CONFIG_64BIT LONG_S $8, PT_R8(sp) LONG_S $9, PT_R9(sp) #endif + LONG_S v1, PT_EPC(sp) LONG_S $25, PT_R25(sp) LONG_S $28, PT_R28(sp) LONG_S $31, PT_R31(sp) - LONG_S v1, PT_EPC(sp) ori $28, sp, _THREAD_MASK xori $28, _THREAD_MASK #ifdef CONFIG_CPU_CAVIUM_OCTEON diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h index 178f7924149a..895320e25662 100644 --- a/arch/mips/include/asm/thread_info.h +++ b/arch/mips/include/asm/thread_info.h @@ -58,8 +58,12 @@ struct thread_info { #define init_stack (init_thread_union.stack) /* How to get the thread information struct from C. */ -register struct thread_info *__current_thread_info __asm__("$28"); -#define current_thread_info() __current_thread_info +static inline struct thread_info *current_thread_info(void) +{ + register struct thread_info *__current_thread_info __asm__("$28"); + + return __current_thread_info; +} #endif /* !__ASSEMBLY__ */ diff --git a/arch/mips/include/asm/time.h b/arch/mips/include/asm/time.h index debc8009bd58..2d7b9df4542d 100644 --- a/arch/mips/include/asm/time.h +++ b/arch/mips/include/asm/time.h @@ -52,13 +52,15 @@ extern int (*perf_irq)(void); */ extern unsigned int __weak get_c0_compare_int(void); extern int r4k_clockevent_init(void); +extern int smtc_clockevent_init(void); +extern int gic_clockevent_init(void); static inline int mips_clockevent_init(void) { #ifdef CONFIG_MIPS_MT_SMTC - extern int smtc_clockevent_init(void); - return smtc_clockevent_init(); +#elif defined(CONFIG_CEVT_GIC) + return (gic_clockevent_init() | r4k_clockevent_init()); #elif defined(CONFIG_CEVT_R4K) return r4k_clockevent_init(); #else @@ -69,9 +71,7 @@ static inline int mips_clockevent_init(void) /* * Initialize the count register as a clocksource */ -#ifdef CONFIG_CSRC_R4K extern int init_r4k_clocksource(void); -#endif static inline int init_mips_clocksource(void) { diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h index bd87e36bf26a..f3fa3750f577 100644 --- a/arch/mips/include/asm/uaccess.h +++ b/arch/mips/include/asm/uaccess.h @@ -23,7 +23,11 @@ */ #ifdef CONFIG_32BIT -#define __UA_LIMIT 0x80000000UL +#ifdef CONFIG_KVM_GUEST +#define __UA_LIMIT 0x40000000UL +#else +#define __UA_LIMIT 0x80000000UL +#endif #define __UA_ADDR ".word" #define __UA_LA "la" @@ -55,8 +59,13 @@ extern u64 __ua_limit; * address in this range it's the process's problem, not ours :-) */ +#ifdef CONFIG_KVM_GUEST +#define KERNEL_DS ((mm_segment_t) { 0x80000000UL }) +#define USER_DS ((mm_segment_t) { 0xC0000000UL }) +#else #define KERNEL_DS ((mm_segment_t) { 0UL }) #define USER_DS ((mm_segment_t) { __UA_LIMIT }) +#endif #define VERIFY_READ 0 #define VERIFY_WRITE 1 @@ -261,6 +270,7 @@ do { \ __asm__ __volatile__( \ "1: " insn " %1, %3 \n" \ "2: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "3: li %0, %4 \n" \ " j 2b \n" \ @@ -287,7 +297,9 @@ do { \ __asm__ __volatile__( \ "1: lw %1, (%3) \n" \ "2: lw %D1, 4(%3) \n" \ - "3: .section .fixup,\"ax\" \n" \ + "3: \n" \ + " .insn \n" \ + " .section .fixup,\"ax\" \n" \ "4: li %0, %4 \n" \ " move %1, $0 \n" \ " move %D1, $0 \n" \ @@ -355,6 +367,7 @@ do { \ __asm__ __volatile__( \ "1: " insn " %z2, %3 # __put_user_asm\n" \ "2: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "3: li %0, %4 \n" \ " j 2b \n" \ @@ -373,6 +386,7 @@ do { \ "1: sw %2, (%3) # __put_user_asm_ll32 \n" \ "2: sw %D2, 4(%3) \n" \ "3: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "4: li %0, %4 \n" \ " j 3b \n" \ @@ -524,6 +538,7 @@ do { \ __asm__ __volatile__( \ "1: " insn " %1, %3 \n" \ "2: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "3: li %0, %4 \n" \ " j 2b \n" \ @@ -549,7 +564,9 @@ do { \ "1: ulw %1, (%3) \n" \ "2: ulw %D1, 4(%3) \n" \ " move %0, $0 \n" \ - "3: .section .fixup,\"ax\" \n" \ + "3: \n" \ + " .insn \n" \ + " .section .fixup,\"ax\" \n" \ "4: li %0, %4 \n" \ " move %1, $0 \n" \ " move %D1, $0 \n" \ @@ -616,6 +633,7 @@ do { \ __asm__ __volatile__( \ "1: " insn " %z2, %3 # __put_user_unaligned_asm\n" \ "2: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "3: li %0, %4 \n" \ " j 2b \n" \ @@ -634,6 +652,7 @@ do { \ "1: sw %2, (%3) # __put_user_unaligned_asm_ll32 \n" \ "2: sw %D2, 4(%3) \n" \ "3: \n" \ + " .insn \n" \ " .section .fixup,\"ax\" \n" \ "4: li %0, %4 \n" \ " j 3b \n" \ diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h index 058e941626a6..370d967725c2 100644 --- a/arch/mips/include/asm/uasm.h +++ b/arch/mips/include/asm/uasm.h @@ -6,7 +6,7 @@ * Copyright (C) 2004, 2005, 2006, 2008 Thiemo Seufer * Copyright (C) 2005 Maciej W. Rozycki * Copyright (C) 2006 Ralf Baechle (ralf@linux-mips.org) - * Copyright (C) 2012 MIPS Technologies, Inc. + * Copyright (C) 2012, 2013 MIPS Technologies, Inc. All rights reserved. */ #include @@ -22,44 +22,75 @@ #define UASM_EXPORT_SYMBOL(sym) #endif +#define _UASM_ISA_CLASSIC 0 +#define _UASM_ISA_MICROMIPS 1 + +#ifndef UASM_ISA +#ifdef CONFIG_CPU_MICROMIPS +#define UASM_ISA _UASM_ISA_MICROMIPS +#else +#define UASM_ISA _UASM_ISA_CLASSIC +#endif +#endif + +#if (UASM_ISA == _UASM_ISA_CLASSIC) +#ifdef CONFIG_CPU_MICROMIPS +#define ISAOPC(op) CL_uasm_i##op +#define ISAFUNC(x) CL_##x +#else +#define ISAOPC(op) uasm_i##op +#define ISAFUNC(x) x +#endif +#elif (UASM_ISA == _UASM_ISA_MICROMIPS) +#ifdef CONFIG_CPU_MICROMIPS +#define ISAOPC(op) uasm_i##op +#define ISAFUNC(x) x +#else +#define ISAOPC(op) MM_uasm_i##op +#define ISAFUNC(x) MM_##x +#endif +#else +#error Unsupported micro-assembler ISA!!! +#endif + #define Ip_u1u2u3(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, unsigned int c) +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, unsigned int c) #define Ip_u2u1u3(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, unsigned int c) +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, unsigned int c) #define Ip_u3u1u2(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, unsigned int c) +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, unsigned int c) #define Ip_u1u2s3(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, signed int c) +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, signed int c) #define Ip_u2s3u1(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, signed int b, unsigned int c) +ISAOPC(op)(u32 **buf, unsigned int a, signed int b, unsigned int c) #define Ip_u2u1s3(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, signed int c) +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, signed int c) #define Ip_u2u1msbu3(op) \ void __uasminit \ -uasm_i##op(u32 **buf, unsigned int a, unsigned int b, unsigned int c, \ +ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b, unsigned int c, \ unsigned int d) #define Ip_u1u2(op) \ -void __uasminit uasm_i##op(u32 **buf, unsigned int a, unsigned int b) +void __uasminit ISAOPC(op)(u32 **buf, unsigned int a, unsigned int b) #define Ip_u1s2(op) \ -void __uasminit uasm_i##op(u32 **buf, unsigned int a, signed int b) +void __uasminit ISAOPC(op)(u32 **buf, unsigned int a, signed int b) -#define Ip_u1(op) void __uasminit uasm_i##op(u32 **buf, unsigned int a) +#define Ip_u1(op) void __uasminit ISAOPC(op)(u32 **buf, unsigned int a) -#define Ip_0(op) void __uasminit uasm_i##op(u32 **buf) +#define Ip_0(op) void __uasminit ISAOPC(op)(u32 **buf) Ip_u2u1s3(_addiu); Ip_u3u1u2(_addu); @@ -132,19 +163,20 @@ struct uasm_label { int lab; }; -void __uasminit uasm_build_label(struct uasm_label **lab, u32 *addr, int lid); +void __uasminit ISAFUNC(uasm_build_label)(struct uasm_label **lab, u32 *addr, + int lid); #ifdef CONFIG_64BIT -int uasm_in_compat_space_p(long addr); +int ISAFUNC(uasm_in_compat_space_p)(long addr); #endif -int uasm_rel_hi(long val); -int uasm_rel_lo(long val); -void UASM_i_LA_mostly(u32 **buf, unsigned int rs, long addr); -void UASM_i_LA(u32 **buf, unsigned int rs, long addr); +int ISAFUNC(uasm_rel_hi)(long val); +int ISAFUNC(uasm_rel_lo)(long val); +void ISAFUNC(UASM_i_LA_mostly)(u32 **buf, unsigned int rs, long addr); +void ISAFUNC(UASM_i_LA)(u32 **buf, unsigned int rs, long addr); #define UASM_L_LA(lb) \ -static inline void __uasminit uasm_l##lb(struct uasm_label **lab, u32 *addr) \ +static inline void __uasminit ISAFUNC(uasm_l##lb)(struct uasm_label **lab, u32 *addr) \ { \ - uasm_build_label(lab, addr, label##lb); \ + ISAFUNC(uasm_build_label)(lab, addr, label##lb); \ } /* convenience macros for instructions */ @@ -196,27 +228,27 @@ static inline void uasm_i_drotr_safe(u32 **p, unsigned int a1, unsigned int a2, unsigned int a3) { if (a3 < 32) - uasm_i_drotr(p, a1, a2, a3); + ISAOPC(_drotr)(p, a1, a2, a3); else - uasm_i_drotr32(p, a1, a2, a3 - 32); + ISAOPC(_drotr32)(p, a1, a2, a3 - 32); } static inline void uasm_i_dsll_safe(u32 **p, unsigned int a1, unsigned int a2, unsigned int a3) { if (a3 < 32) - uasm_i_dsll(p, a1, a2, a3); + ISAOPC(_dsll)(p, a1, a2, a3); else - uasm_i_dsll32(p, a1, a2, a3 - 32); + ISAOPC(_dsll32)(p, a1, a2, a3 - 32); } static inline void uasm_i_dsrl_safe(u32 **p, unsigned int a1, unsigned int a2, unsigned int a3) { if (a3 < 32) - uasm_i_dsrl(p, a1, a2, a3); + ISAOPC(_dsrl)(p, a1, a2, a3); else - uasm_i_dsrl32(p, a1, a2, a3 - 32); + ISAOPC(_dsrl32)(p, a1, a2, a3 - 32); } /* Handle relocations. */ diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h index 4d078815eaa5..0f4aec2ad1e6 100644 --- a/arch/mips/include/uapi/asm/inst.h +++ b/arch/mips/include/uapi/asm/inst.h @@ -7,6 +7,7 @@ * * Copyright (C) 1996, 2000 by Ralf Baechle * Copyright (C) 2006 by Thiemo Seufer + * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. */ #ifndef _UAPI_ASM_INST_H #define _UAPI_ASM_INST_H @@ -192,6 +193,282 @@ enum lx_func { lbx_op = 0x16, }; +/* + * (microMIPS) Major opcodes. + */ +enum mm_major_op { + mm_pool32a_op, mm_pool16a_op, mm_lbu16_op, mm_move16_op, + mm_addi32_op, mm_lbu32_op, mm_sb32_op, mm_lb32_op, + mm_pool32b_op, mm_pool16b_op, mm_lhu16_op, mm_andi16_op, + mm_addiu32_op, mm_lhu32_op, mm_sh32_op, mm_lh32_op, + mm_pool32i_op, mm_pool16c_op, mm_lwsp16_op, mm_pool16d_op, + mm_ori32_op, mm_pool32f_op, mm_reserved1_op, mm_reserved2_op, + mm_pool32c_op, mm_lwgp16_op, mm_lw16_op, mm_pool16e_op, + mm_xori32_op, mm_jals32_op, mm_addiupc_op, mm_reserved3_op, + mm_reserved4_op, mm_pool16f_op, mm_sb16_op, mm_beqz16_op, + mm_slti32_op, mm_beq32_op, mm_swc132_op, mm_lwc132_op, + mm_reserved5_op, mm_reserved6_op, mm_sh16_op, mm_bnez16_op, + mm_sltiu32_op, mm_bne32_op, mm_sdc132_op, mm_ldc132_op, + mm_reserved7_op, mm_reserved8_op, mm_swsp16_op, mm_b16_op, + mm_andi32_op, mm_j32_op, mm_sd32_op, mm_ld32_op, + mm_reserved11_op, mm_reserved12_op, mm_sw16_op, mm_li16_op, + mm_jalx32_op, mm_jal32_op, mm_sw32_op, mm_lw32_op, +}; + +/* + * (microMIPS) POOL32I minor opcodes. + */ +enum mm_32i_minor_op { + mm_bltz_op, mm_bltzal_op, mm_bgez_op, mm_bgezal_op, + mm_blez_op, mm_bnezc_op, mm_bgtz_op, mm_beqzc_op, + mm_tlti_op, mm_tgei_op, mm_tltiu_op, mm_tgeiu_op, + mm_tnei_op, mm_lui_op, mm_teqi_op, mm_reserved13_op, + mm_synci_op, mm_bltzals_op, mm_reserved14_op, mm_bgezals_op, + mm_bc2f_op, mm_bc2t_op, mm_reserved15_op, mm_reserved16_op, + mm_reserved17_op, mm_reserved18_op, mm_bposge64_op, mm_bposge32_op, + mm_bc1f_op, mm_bc1t_op, mm_reserved19_op, mm_reserved20_op, + mm_bc1any2f_op, mm_bc1any2t_op, mm_bc1any4f_op, mm_bc1any4t_op, +}; + +/* + * (microMIPS) POOL32A minor opcodes. + */ +enum mm_32a_minor_op { + mm_sll32_op = 0x000, + mm_ins_op = 0x00c, + mm_ext_op = 0x02c, + mm_pool32axf_op = 0x03c, + mm_srl32_op = 0x040, + mm_sra_op = 0x080, + mm_rotr_op = 0x0c0, + mm_lwxs_op = 0x118, + mm_addu32_op = 0x150, + mm_subu32_op = 0x1d0, + mm_and_op = 0x250, + mm_or32_op = 0x290, + mm_xor32_op = 0x310, +}; + +/* + * (microMIPS) POOL32B functions. + */ +enum mm_32b_func { + mm_lwc2_func = 0x0, + mm_lwp_func = 0x1, + mm_ldc2_func = 0x2, + mm_ldp_func = 0x4, + mm_lwm32_func = 0x5, + mm_cache_func = 0x6, + mm_ldm_func = 0x7, + mm_swc2_func = 0x8, + mm_swp_func = 0x9, + mm_sdc2_func = 0xa, + mm_sdp_func = 0xc, + mm_swm32_func = 0xd, + mm_sdm_func = 0xf, +}; + +/* + * (microMIPS) POOL32C functions. + */ +enum mm_32c_func { + mm_pref_func = 0x2, + mm_ll_func = 0x3, + mm_swr_func = 0x9, + mm_sc_func = 0xb, + mm_lwu_func = 0xe, +}; + +/* + * (microMIPS) POOL32AXF minor opcodes. + */ +enum mm_32axf_minor_op { + mm_mfc0_op = 0x003, + mm_mtc0_op = 0x00b, + mm_tlbp_op = 0x00d, + mm_jalr_op = 0x03c, + mm_tlbr_op = 0x04d, + mm_jalrhb_op = 0x07c, + mm_tlbwi_op = 0x08d, + mm_tlbwr_op = 0x0cd, + mm_jalrs_op = 0x13c, + mm_jalrshb_op = 0x17c, + mm_syscall_op = 0x22d, + mm_eret_op = 0x3cd, +}; + +/* + * (microMIPS) POOL32F minor opcodes. + */ +enum mm_32f_minor_op { + mm_32f_00_op = 0x00, + mm_32f_01_op = 0x01, + mm_32f_02_op = 0x02, + mm_32f_10_op = 0x08, + mm_32f_11_op = 0x09, + mm_32f_12_op = 0x0a, + mm_32f_20_op = 0x10, + mm_32f_30_op = 0x18, + mm_32f_40_op = 0x20, + mm_32f_41_op = 0x21, + mm_32f_42_op = 0x22, + mm_32f_50_op = 0x28, + mm_32f_51_op = 0x29, + mm_32f_52_op = 0x2a, + mm_32f_60_op = 0x30, + mm_32f_70_op = 0x38, + mm_32f_73_op = 0x3b, + mm_32f_74_op = 0x3c, +}; + +/* + * (microMIPS) POOL32F secondary minor opcodes. + */ +enum mm_32f_10_minor_op { + mm_lwxc1_op = 0x1, + mm_swxc1_op, + mm_ldxc1_op, + mm_sdxc1_op, + mm_luxc1_op, + mm_suxc1_op, +}; + +enum mm_32f_func { + mm_lwxc1_func = 0x048, + mm_swxc1_func = 0x088, + mm_ldxc1_func = 0x0c8, + mm_sdxc1_func = 0x108, +}; + +/* + * (microMIPS) POOL32F secondary minor opcodes. + */ +enum mm_32f_40_minor_op { + mm_fmovf_op, + mm_fmovt_op, +}; + +/* + * (microMIPS) POOL32F secondary minor opcodes. + */ +enum mm_32f_60_minor_op { + mm_fadd_op, + mm_fsub_op, + mm_fmul_op, + mm_fdiv_op, +}; + +/* + * (microMIPS) POOL32F secondary minor opcodes. + */ +enum mm_32f_70_minor_op { + mm_fmovn_op, + mm_fmovz_op, +}; + +/* + * (microMIPS) POOL32FXF secondary minor opcodes for POOL32F. + */ +enum mm_32f_73_minor_op { + mm_fmov0_op = 0x01, + mm_fcvtl_op = 0x04, + mm_movf0_op = 0x05, + mm_frsqrt_op = 0x08, + mm_ffloorl_op = 0x0c, + mm_fabs0_op = 0x0d, + mm_fcvtw_op = 0x24, + mm_movt0_op = 0x25, + mm_fsqrt_op = 0x28, + mm_ffloorw_op = 0x2c, + mm_fneg0_op = 0x2d, + mm_cfc1_op = 0x40, + mm_frecip_op = 0x48, + mm_fceill_op = 0x4c, + mm_fcvtd0_op = 0x4d, + mm_ctc1_op = 0x60, + mm_fceilw_op = 0x6c, + mm_fcvts0_op = 0x6d, + mm_mfc1_op = 0x80, + mm_fmov1_op = 0x81, + mm_movf1_op = 0x85, + mm_ftruncl_op = 0x8c, + mm_fabs1_op = 0x8d, + mm_mtc1_op = 0xa0, + mm_movt1_op = 0xa5, + mm_ftruncw_op = 0xac, + mm_fneg1_op = 0xad, + mm_froundl_op = 0xcc, + mm_fcvtd1_op = 0xcd, + mm_froundw_op = 0xec, + mm_fcvts1_op = 0xed, +}; + +/* + * (microMIPS) POOL16C minor opcodes. + */ +enum mm_16c_minor_op { + mm_lwm16_op = 0x04, + mm_swm16_op = 0x05, + mm_jr16_op = 0x18, + mm_jrc_op = 0x1a, + mm_jalr16_op = 0x1c, + mm_jalrs16_op = 0x1e, +}; + +/* + * (microMIPS) POOL16D minor opcodes. + */ +enum mm_16d_minor_op { + mm_addius5_func, + mm_addiusp_func, +}; + +/* + * (MIPS16e) opcodes. + */ +enum MIPS16e_ops { + MIPS16e_jal_op = 003, + MIPS16e_ld_op = 007, + MIPS16e_i8_op = 014, + MIPS16e_sd_op = 017, + MIPS16e_lb_op = 020, + MIPS16e_lh_op = 021, + MIPS16e_lwsp_op = 022, + MIPS16e_lw_op = 023, + MIPS16e_lbu_op = 024, + MIPS16e_lhu_op = 025, + MIPS16e_lwpc_op = 026, + MIPS16e_lwu_op = 027, + MIPS16e_sb_op = 030, + MIPS16e_sh_op = 031, + MIPS16e_swsp_op = 032, + MIPS16e_sw_op = 033, + MIPS16e_rr_op = 035, + MIPS16e_extend_op = 036, + MIPS16e_i64_op = 037, +}; + +enum MIPS16e_i64_func { + MIPS16e_ldsp_func, + MIPS16e_sdsp_func, + MIPS16e_sdrasp_func, + MIPS16e_dadjsp_func, + MIPS16e_ldpc_func, +}; + +enum MIPS16e_rr_func { + MIPS16e_jr_func, +}; + +enum MIPS6e_i8_func { + MIPS16e_swrasp_func = 02, +}; + +/* + * (microMIPS & MIPS16e) NOP instruction. + */ +#define MM_NOP16 0x0c00 + /* * Damn ... bitfields depend from byteorder :-( */ @@ -311,6 +588,262 @@ struct v_format { /* MDMX vector format */ ;))))))) }; +/* + * microMIPS instruction formats (32-bit length) + * + * NOTE: + * Parenthesis denote whether the format is a microMIPS instruction or + * if it is MIPS32 instruction re-encoded for use in the microMIPS ASE. + */ +struct fb_format { /* FPU branch format (MIPS32) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int bc : 5, + BITFIELD_FIELD(unsigned int cc : 3, + BITFIELD_FIELD(unsigned int flag : 2, + BITFIELD_FIELD(signed int simmediate : 16, + ;))))) +}; + +struct fp0_format { /* FPU multiply and add format (MIPS32) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int fmt : 5, + BITFIELD_FIELD(unsigned int ft : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_fp0_format { /* FPU multipy and add format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int ft : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int fmt : 3, + BITFIELD_FIELD(unsigned int op : 2, + BITFIELD_FIELD(unsigned int func : 6, + ;))))))) +}; + +struct fp1_format { /* FPU mfc1 and cfc1 format (MIPS32) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int op : 5, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_fp1_format { /* FPU mfc1 and cfc1 format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fmt : 2, + BITFIELD_FIELD(unsigned int op : 8, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_fp2_format { /* FPU movt and movf format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int cc : 3, + BITFIELD_FIELD(unsigned int zero : 2, + BITFIELD_FIELD(unsigned int fmt : 2, + BITFIELD_FIELD(unsigned int op : 3, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))))) +}; + +struct mm_fp3_format { /* FPU abs and neg format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fmt : 3, + BITFIELD_FIELD(unsigned int op : 7, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_fp4_format { /* FPU c.cond format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int cc : 3, + BITFIELD_FIELD(unsigned int fmt : 3, + BITFIELD_FIELD(unsigned int cond : 4, + BITFIELD_FIELD(unsigned int func : 6, + ;))))))) +}; + +struct mm_fp5_format { /* FPU lwxc1 and swxc1 format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int index : 5, + BITFIELD_FIELD(unsigned int base : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int op : 5, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct fp6_format { /* FPU madd and msub format (MIPS IV) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int fr : 5, + BITFIELD_FIELD(unsigned int ft : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_fp6_format { /* FPU madd and msub format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int ft : 5, + BITFIELD_FIELD(unsigned int fs : 5, + BITFIELD_FIELD(unsigned int fd : 5, + BITFIELD_FIELD(unsigned int fr : 5, + BITFIELD_FIELD(unsigned int func : 6, + ;)))))) +}; + +struct mm_i_format { /* Immediate format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(unsigned int rs : 5, + BITFIELD_FIELD(signed int simmediate : 16, + ;)))) +}; + +struct mm_m_format { /* Multi-word load/store format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rd : 5, + BITFIELD_FIELD(unsigned int base : 5, + BITFIELD_FIELD(unsigned int func : 4, + BITFIELD_FIELD(signed int simmediate : 12, + ;))))) +}; + +struct mm_x_format { /* Scaled indexed load format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int index : 5, + BITFIELD_FIELD(unsigned int base : 5, + BITFIELD_FIELD(unsigned int rd : 5, + BITFIELD_FIELD(unsigned int func : 11, + ;))))) +}; + +/* + * microMIPS instruction formats (16-bit length) + */ +struct mm_b0_format { /* Unconditional branch format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(signed int simmediate : 10, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;))) +}; + +struct mm_b1_format { /* Conditional branch format (microMIPS) */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rs : 3, + BITFIELD_FIELD(signed int simmediate : 7, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;)))) +}; + +struct mm16_m_format { /* Multi-word load/store format */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int func : 4, + BITFIELD_FIELD(unsigned int rlist : 2, + BITFIELD_FIELD(unsigned int imm : 4, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;))))) +}; + +struct mm16_rb_format { /* Signed immediate format */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 3, + BITFIELD_FIELD(unsigned int base : 3, + BITFIELD_FIELD(signed int simmediate : 4, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;))))) +}; + +struct mm16_r3_format { /* Load from global pointer format */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 3, + BITFIELD_FIELD(signed int simmediate : 7, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;)))) +}; + +struct mm16_r5_format { /* Load/store from stack pointer format */ + BITFIELD_FIELD(unsigned int opcode : 6, + BITFIELD_FIELD(unsigned int rt : 5, + BITFIELD_FIELD(signed int simmediate : 5, + BITFIELD_FIELD(unsigned int : 16, /* Ignored */ + ;)))) +}; + +/* + * MIPS16e instruction formats (16-bit length) + */ +struct m16e_rr { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int rx : 3, + BITFIELD_FIELD(unsigned int nd : 1, + BITFIELD_FIELD(unsigned int l : 1, + BITFIELD_FIELD(unsigned int ra : 1, + BITFIELD_FIELD(unsigned int func : 5, + ;)))))) +}; + +struct m16e_jal { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int x : 1, + BITFIELD_FIELD(unsigned int imm20_16 : 5, + BITFIELD_FIELD(signed int imm25_21 : 5, + ;)))) +}; + +struct m16e_i64 { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int func : 3, + BITFIELD_FIELD(unsigned int imm : 8, + ;))) +}; + +struct m16e_ri64 { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int func : 3, + BITFIELD_FIELD(unsigned int ry : 3, + BITFIELD_FIELD(unsigned int imm : 5, + ;)))) +}; + +struct m16e_ri { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int rx : 3, + BITFIELD_FIELD(unsigned int imm : 8, + ;))) +}; + +struct m16e_rri { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int rx : 3, + BITFIELD_FIELD(unsigned int ry : 3, + BITFIELD_FIELD(unsigned int imm : 5, + ;)))) +}; + +struct m16e_i8 { + BITFIELD_FIELD(unsigned int opcode : 5, + BITFIELD_FIELD(unsigned int func : 3, + BITFIELD_FIELD(unsigned int imm : 8, + ;))) +}; + union mips_instruction { unsigned int word; unsigned short halfword[2]; @@ -326,6 +859,37 @@ union mips_instruction { struct b_format b_format; struct ps_format ps_format; struct v_format v_format; + struct fb_format fb_format; + struct fp0_format fp0_format; + struct mm_fp0_format mm_fp0_format; + struct fp1_format fp1_format; + struct mm_fp1_format mm_fp1_format; + struct mm_fp2_format mm_fp2_format; + struct mm_fp3_format mm_fp3_format; + struct mm_fp4_format mm_fp4_format; + struct mm_fp5_format mm_fp5_format; + struct fp6_format fp6_format; + struct mm_fp6_format mm_fp6_format; + struct mm_i_format mm_i_format; + struct mm_m_format mm_m_format; + struct mm_x_format mm_x_format; + struct mm_b0_format mm_b0_format; + struct mm_b1_format mm_b1_format; + struct mm16_m_format mm16_m_format ; + struct mm16_rb_format mm16_rb_format; + struct mm16_r3_format mm16_r3_format; + struct mm16_r5_format mm16_r5_format; +}; + +union mips16e_instruction { + unsigned int full : 16; + struct m16e_rr rr; + struct m16e_jal jal; + struct m16e_i64 i64; + struct m16e_ri64 ri64; + struct m16e_ri ri; + struct m16e_rri rri; + struct m16e_i8 i8; }; #endif /* _UAPI_ASM_INST_H */ diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile index 520a908d45d6..6ad9e04bdf62 100644 --- a/arch/mips/kernel/Makefile +++ b/arch/mips/kernel/Makefile @@ -5,7 +5,7 @@ extra-y := head.o vmlinux.lds obj-y += cpu-probe.o branch.o entry.o genex.o irq.o process.o \ - ptrace.o reset.o setup.o signal.o syscall.o \ + prom.o ptrace.o reset.o setup.o signal.o syscall.o \ time.o topology.o traps.o unaligned.o watch.o vdso.o ifdef CONFIG_FUNCTION_TRACER @@ -19,15 +19,16 @@ obj-$(CONFIG_CEVT_BCM1480) += cevt-bcm1480.o obj-$(CONFIG_CEVT_R4K) += cevt-r4k.o obj-$(CONFIG_MIPS_MT_SMTC) += cevt-smtc.o obj-$(CONFIG_CEVT_DS1287) += cevt-ds1287.o +obj-$(CONFIG_CEVT_GIC) += cevt-gic.o obj-$(CONFIG_CEVT_GT641XX) += cevt-gt641xx.o obj-$(CONFIG_CEVT_SB1250) += cevt-sb1250.o obj-$(CONFIG_CEVT_TXX9) += cevt-txx9.o obj-$(CONFIG_CSRC_BCM1480) += csrc-bcm1480.o +obj-$(CONFIG_CSRC_GIC) += csrc-gic.o obj-$(CONFIG_CSRC_IOASIC) += csrc-ioasic.o obj-$(CONFIG_CSRC_POWERTV) += csrc-powertv.o obj-$(CONFIG_CSRC_R4K) += csrc-r4k.o obj-$(CONFIG_CSRC_SB1250) += csrc-sb1250.o -obj-$(CONFIG_CSRC_GIC) += csrc-gic.o obj-$(CONFIG_SYNC_R4K) += sync-r4k.o obj-$(CONFIG_STACKTRACE) += stacktrace.o @@ -86,8 +87,6 @@ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o obj-$(CONFIG_SPINLOCK_TEST) += spinlock_test.o obj-$(CONFIG_MIPS_MACHINE) += mips_machine.o -obj-$(CONFIG_OF) += prom.o - CFLAGS_cpu-bugs64.o = $(shell if $(CC) $(KBUILD_CFLAGS) -Wa,-mdaddi -c -o /dev/null -x c /dev/null >/dev/null 2>&1; then echo "-DHAVE_AS_SET_DADDI"; fi) obj-$(CONFIG_HAVE_STD_PC_SERIAL_PORT) += 8250-platform.o diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c index 50285b2c7ffe..0845091ba480 100644 --- a/arch/mips/kernel/asm-offsets.c +++ b/arch/mips/kernel/asm-offsets.c @@ -17,6 +17,8 @@ #include #include +#include + void output_ptreg_defines(void) { COMMENT("MIPS pt_regs offsets."); @@ -328,3 +330,67 @@ void output_pbe_defines(void) BLANK(); } #endif + +void output_kvm_defines(void) +{ + COMMENT(" KVM/MIPS Specfic offsets. "); + DEFINE(VCPU_ARCH_SIZE, sizeof(struct kvm_vcpu_arch)); + OFFSET(VCPU_RUN, kvm_vcpu, run); + OFFSET(VCPU_HOST_ARCH, kvm_vcpu, arch); + + OFFSET(VCPU_HOST_EBASE, kvm_vcpu_arch, host_ebase); + OFFSET(VCPU_GUEST_EBASE, kvm_vcpu_arch, guest_ebase); + + OFFSET(VCPU_HOST_STACK, kvm_vcpu_arch, host_stack); + OFFSET(VCPU_HOST_GP, kvm_vcpu_arch, host_gp); + + OFFSET(VCPU_HOST_CP0_BADVADDR, kvm_vcpu_arch, host_cp0_badvaddr); + OFFSET(VCPU_HOST_CP0_CAUSE, kvm_vcpu_arch, host_cp0_cause); + OFFSET(VCPU_HOST_EPC, kvm_vcpu_arch, host_cp0_epc); + OFFSET(VCPU_HOST_ENTRYHI, kvm_vcpu_arch, host_cp0_entryhi); + + OFFSET(VCPU_GUEST_INST, kvm_vcpu_arch, guest_inst); + + OFFSET(VCPU_R0, kvm_vcpu_arch, gprs[0]); + OFFSET(VCPU_R1, kvm_vcpu_arch, gprs[1]); + OFFSET(VCPU_R2, kvm_vcpu_arch, gprs[2]); + OFFSET(VCPU_R3, kvm_vcpu_arch, gprs[3]); + OFFSET(VCPU_R4, kvm_vcpu_arch, gprs[4]); + OFFSET(VCPU_R5, kvm_vcpu_arch, gprs[5]); + OFFSET(VCPU_R6, kvm_vcpu_arch, gprs[6]); + OFFSET(VCPU_R7, kvm_vcpu_arch, gprs[7]); + OFFSET(VCPU_R8, kvm_vcpu_arch, gprs[8]); + OFFSET(VCPU_R9, kvm_vcpu_arch, gprs[9]); + OFFSET(VCPU_R10, kvm_vcpu_arch, gprs[10]); + OFFSET(VCPU_R11, kvm_vcpu_arch, gprs[11]); + OFFSET(VCPU_R12, kvm_vcpu_arch, gprs[12]); + OFFSET(VCPU_R13, kvm_vcpu_arch, gprs[13]); + OFFSET(VCPU_R14, kvm_vcpu_arch, gprs[14]); + OFFSET(VCPU_R15, kvm_vcpu_arch, gprs[15]); + OFFSET(VCPU_R16, kvm_vcpu_arch, gprs[16]); + OFFSET(VCPU_R17, kvm_vcpu_arch, gprs[17]); + OFFSET(VCPU_R18, kvm_vcpu_arch, gprs[18]); + OFFSET(VCPU_R19, kvm_vcpu_arch, gprs[19]); + OFFSET(VCPU_R20, kvm_vcpu_arch, gprs[20]); + OFFSET(VCPU_R21, kvm_vcpu_arch, gprs[21]); + OFFSET(VCPU_R22, kvm_vcpu_arch, gprs[22]); + OFFSET(VCPU_R23, kvm_vcpu_arch, gprs[23]); + OFFSET(VCPU_R24, kvm_vcpu_arch, gprs[24]); + OFFSET(VCPU_R25, kvm_vcpu_arch, gprs[25]); + OFFSET(VCPU_R26, kvm_vcpu_arch, gprs[26]); + OFFSET(VCPU_R27, kvm_vcpu_arch, gprs[27]); + OFFSET(VCPU_R28, kvm_vcpu_arch, gprs[28]); + OFFSET(VCPU_R29, kvm_vcpu_arch, gprs[29]); + OFFSET(VCPU_R30, kvm_vcpu_arch, gprs[30]); + OFFSET(VCPU_R31, kvm_vcpu_arch, gprs[31]); + OFFSET(VCPU_LO, kvm_vcpu_arch, lo); + OFFSET(VCPU_HI, kvm_vcpu_arch, hi); + OFFSET(VCPU_PC, kvm_vcpu_arch, pc); + OFFSET(VCPU_COP0, kvm_vcpu_arch, cop0); + OFFSET(VCPU_GUEST_KERNEL_ASID, kvm_vcpu_arch, guest_kernel_asid); + OFFSET(VCPU_GUEST_USER_ASID, kvm_vcpu_arch, guest_user_asid); + + OFFSET(COP0_TLB_HI, mips_coproc, reg[MIPS_CP0_TLB_HI][0]); + OFFSET(COP0_STATUS, mips_coproc, reg[MIPS_CP0_STATUS][0]); + BLANK(); +} diff --git a/arch/mips/kernel/binfmt_elfo32.c b/arch/mips/kernel/binfmt_elfo32.c index 556a4357d7fc..97c5a1668e53 100644 --- a/arch/mips/kernel/binfmt_elfo32.c +++ b/arch/mips/kernel/binfmt_elfo32.c @@ -48,7 +48,11 @@ typedef elf_fpreg_t elf_fpregset_t[ELF_NFPREG]; __res; \ }) +#ifdef CONFIG_KVM_GUEST +#define TASK32_SIZE 0x3fff8000UL +#else #define TASK32_SIZE 0x7fff8000UL +#endif #undef ELF_ET_DYN_BASE #define ELF_ET_DYN_BASE (TASK32_SIZE / 3 * 2) diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c index 83ffe950f710..46c2ad0703a0 100644 --- a/arch/mips/kernel/branch.c +++ b/arch/mips/kernel/branch.c @@ -14,10 +14,186 @@ #include #include #include +#include #include #include #include +/* + * Calculate and return exception PC in case of branch delay slot + * for microMIPS and MIPS16e. It does not clear the ISA mode bit. + */ +int __isa_exception_epc(struct pt_regs *regs) +{ + unsigned short inst; + long epc = regs->cp0_epc; + + /* Calculate exception PC in branch delay slot. */ + if (__get_user(inst, (u16 __user *) msk_isa16_mode(epc))) { + /* This should never happen because delay slot was checked. */ + force_sig(SIGSEGV, current); + return epc; + } + if (cpu_has_mips16) { + if (((union mips16e_instruction)inst).ri.opcode + == MIPS16e_jal_op) + epc += 4; + else + epc += 2; + } else if (mm_insn_16bit(inst)) + epc += 2; + else + epc += 4; + + return epc; +} + +/* + * Compute return address and emulate branch in microMIPS mode after an + * exception only. It does not handle compact branches/jumps and cannot + * be used in interrupt context. (Compact branches/jumps do not cause + * exceptions.) + */ +int __microMIPS_compute_return_epc(struct pt_regs *regs) +{ + u16 __user *pc16; + u16 halfword; + unsigned int word; + unsigned long contpc; + struct mm_decoded_insn mminsn = { 0 }; + + mminsn.micro_mips_mode = 1; + + /* This load never faults. */ + pc16 = (unsigned short __user *)msk_isa16_mode(regs->cp0_epc); + __get_user(halfword, pc16); + pc16++; + contpc = regs->cp0_epc + 2; + word = ((unsigned int)halfword << 16); + mminsn.pc_inc = 2; + + if (!mm_insn_16bit(halfword)) { + __get_user(halfword, pc16); + pc16++; + contpc = regs->cp0_epc + 4; + mminsn.pc_inc = 4; + word |= halfword; + } + mminsn.insn = word; + + if (get_user(halfword, pc16)) + goto sigsegv; + mminsn.next_pc_inc = 2; + word = ((unsigned int)halfword << 16); + + if (!mm_insn_16bit(halfword)) { + pc16++; + if (get_user(halfword, pc16)) + goto sigsegv; + mminsn.next_pc_inc = 4; + word |= halfword; + } + mminsn.next_insn = word; + + mm_isBranchInstr(regs, mminsn, &contpc); + + regs->cp0_epc = contpc; + + return 0; + +sigsegv: + force_sig(SIGSEGV, current); + return -EFAULT; +} + +/* + * Compute return address and emulate branch in MIPS16e mode after an + * exception only. It does not handle compact branches/jumps and cannot + * be used in interrupt context. (Compact branches/jumps do not cause + * exceptions.) + */ +int __MIPS16e_compute_return_epc(struct pt_regs *regs) +{ + u16 __user *addr; + union mips16e_instruction inst; + u16 inst2; + u32 fullinst; + long epc; + + epc = regs->cp0_epc; + + /* Read the instruction. */ + addr = (u16 __user *)msk_isa16_mode(epc); + if (__get_user(inst.full, addr)) { + force_sig(SIGSEGV, current); + return -EFAULT; + } + + switch (inst.ri.opcode) { + case MIPS16e_extend_op: + regs->cp0_epc += 4; + return 0; + + /* + * JAL and JALX in MIPS16e mode + */ + case MIPS16e_jal_op: + addr += 1; + if (__get_user(inst2, addr)) { + force_sig(SIGSEGV, current); + return -EFAULT; + } + fullinst = ((unsigned)inst.full << 16) | inst2; + regs->regs[31] = epc + 6; + epc += 4; + epc >>= 28; + epc <<= 28; + /* + * JAL:5 X:1 TARGET[20-16]:5 TARGET[25:21]:5 TARGET[15:0]:16 + * + * ......TARGET[15:0].................TARGET[20:16]........... + * ......TARGET[25:21] + */ + epc |= + ((fullinst & 0xffff) << 2) | ((fullinst & 0x3e00000) >> 3) | + ((fullinst & 0x1f0000) << 7); + if (!inst.jal.x) + set_isa16_mode(epc); /* Set ISA mode bit. */ + regs->cp0_epc = epc; + return 0; + + /* + * J(AL)R(C) + */ + case MIPS16e_rr_op: + if (inst.rr.func == MIPS16e_jr_func) { + + if (inst.rr.ra) + regs->cp0_epc = regs->regs[31]; + else + regs->cp0_epc = + regs->regs[reg16to32[inst.rr.rx]]; + + if (inst.rr.l) { + if (inst.rr.nd) + regs->regs[31] = epc + 2; + else + regs->regs[31] = epc + 4; + } + return 0; + } + break; + } + + /* + * All other cases have no branch delay slot and are 16-bits. + * Branches do not cause an exception. + */ + regs->cp0_epc += 2; + + return 0; +} + /** * __compute_return_epc_for_insn - Computes the return address and do emulate * branch simulation, if required. @@ -129,6 +305,8 @@ int __compute_return_epc_for_insn(struct pt_regs *regs, epc <<= 28; epc |= (insn.j_format.target << 2); regs->cp0_epc = epc; + if (insn.i_format.opcode == jalx_op) + set_isa16_mode(regs->cp0_epc); break; /* diff --git a/arch/mips/kernel/cevt-gic.c b/arch/mips/kernel/cevt-gic.c new file mode 100644 index 000000000000..730eaf92c018 --- /dev/null +++ b/arch/mips/kernel/cevt-gic.c @@ -0,0 +1,104 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2013 Imagination Technologies Ltd. + */ +#include +#include +#include +#include +#include + +#include +#include +#include + +DEFINE_PER_CPU(struct clock_event_device, gic_clockevent_device); +int gic_timer_irq_installed; + + +static int gic_next_event(unsigned long delta, struct clock_event_device *evt) +{ + u64 cnt; + int res; + + cnt = gic_read_count(); + cnt += (u64)delta; + gic_write_compare(cnt); + res = ((int)(gic_read_count() - cnt) >= 0) ? -ETIME : 0; + return res; +} + +void gic_set_clock_mode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + /* Nothing to do ... */ +} + +irqreturn_t gic_compare_interrupt(int irq, void *dev_id) +{ + struct clock_event_device *cd; + int cpu = smp_processor_id(); + + gic_write_compare(gic_read_compare()); + cd = &per_cpu(gic_clockevent_device, cpu); + cd->event_handler(cd); + return IRQ_HANDLED; +} + +struct irqaction gic_compare_irqaction = { + .handler = gic_compare_interrupt, + .flags = IRQF_PERCPU | IRQF_TIMER, + .name = "timer", +}; + + +void gic_event_handler(struct clock_event_device *dev) +{ +} + +int __cpuinit gic_clockevent_init(void) +{ + unsigned int cpu = smp_processor_id(); + struct clock_event_device *cd; + unsigned int irq; + + if (!cpu_has_counter || !gic_frequency) + return -ENXIO; + + irq = MIPS_GIC_IRQ_BASE; + + cd = &per_cpu(gic_clockevent_device, cpu); + + cd->name = "MIPS GIC"; + cd->features = CLOCK_EVT_FEAT_ONESHOT; + + clockevent_set_clock(cd, gic_frequency); + + /* Calculate the min / max delta */ + cd->max_delta_ns = clockevent_delta2ns(0x7fffffff, cd); + cd->min_delta_ns = clockevent_delta2ns(0x300, cd); + + cd->rating = 300; + cd->irq = irq; + cd->cpumask = cpumask_of(cpu); + cd->set_next_event = gic_next_event; + cd->set_mode = gic_set_clock_mode; + cd->event_handler = gic_event_handler; + + clockevents_register_device(cd); + + GICWRITE(GIC_REG(VPE_LOCAL, GIC_VPE_COMPARE_MAP), 0x80000002); + GICWRITE(GIC_REG(VPE_LOCAL, GIC_VPE_SMASK), GIC_VPE_SMASK_CMP_MSK); + + if (gic_timer_irq_installed) + return 0; + + gic_timer_irq_installed = 1; + + setup_irq(irq, &gic_compare_irqaction); + irq_set_handler(irq, handle_percpu_irq); + return 0; +} diff --git a/arch/mips/kernel/cevt-r4k.c b/arch/mips/kernel/cevt-r4k.c index 07b847d77f5d..02033eaf8825 100644 --- a/arch/mips/kernel/cevt-r4k.c +++ b/arch/mips/kernel/cevt-r4k.c @@ -23,7 +23,6 @@ */ #ifndef CONFIG_MIPS_MT_SMTC - static int mips_next_event(unsigned long delta, struct clock_event_device *evt) { @@ -49,7 +48,6 @@ DEFINE_PER_CPU(struct clock_event_device, mips_clockevent_device); int cp0_timer_irq_installed; #ifndef CONFIG_MIPS_MT_SMTC - irqreturn_t c0_compare_interrupt(int irq, void *dev_id) { const int r2 = cpu_has_mips_r2; @@ -74,6 +72,9 @@ irqreturn_t c0_compare_interrupt(int irq, void *dev_id) /* Clear Count/Compare Interrupt */ write_c0_compare(read_c0_compare()); cd = &per_cpu(mips_clockevent_device, cpu); +#ifdef CONFIG_CEVT_GIC + if (!gic_present) +#endif cd->event_handler(cd); } @@ -118,6 +119,10 @@ int c0_compare_int_usable(void) unsigned int delta; unsigned int cnt; +#ifdef CONFIG_KVM_GUEST + return 1; +#endif + /* * IP7 already pending? Try to clear it by acking the timer. */ @@ -166,7 +171,6 @@ int c0_compare_int_usable(void) } #ifndef CONFIG_MIPS_MT_SMTC - int __cpuinit r4k_clockevent_init(void) { unsigned int cpu = smp_processor_id(); @@ -206,6 +210,9 @@ int __cpuinit r4k_clockevent_init(void) cd->set_mode = mips_set_clock_mode; cd->event_handler = mips_event_handler; +#ifdef CONFIG_CEVT_GIC + if (!gic_present) +#endif clockevents_register_device(cd); if (cp0_timer_irq_installed) diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c index 5fe66a0c3224..4bbffdb9024f 100644 --- a/arch/mips/kernel/cpu-probe.c +++ b/arch/mips/kernel/cpu-probe.c @@ -470,6 +470,9 @@ static inline unsigned int decode_config3(struct cpuinfo_mips *c) c->options |= MIPS_CPU_ULRI; if (config3 & MIPS_CONF3_ISA) c->options |= MIPS_CPU_MICROMIPS; +#ifdef CONFIG_CPU_MICROMIPS + write_c0_config3(read_c0_config3() | MIPS_CONF3_ISA_OE); +#endif if (config3 & MIPS_CONF3_VZ) c->ases |= MIPS_ASE_VZ; diff --git a/arch/mips/kernel/csrc-gic.c b/arch/mips/kernel/csrc-gic.c index 5dca24bce51b..e02620901117 100644 --- a/arch/mips/kernel/csrc-gic.c +++ b/arch/mips/kernel/csrc-gic.c @@ -5,23 +5,14 @@ * * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. */ -#include #include +#include -#include #include static cycle_t gic_hpt_read(struct clocksource *cs) { - unsigned int hi, hi2, lo; - - do { - GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_63_32), hi); - GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_31_00), lo); - GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_63_32), hi2); - } while (hi2 != hi); - - return (((cycle_t) hi) << 32) + lo; + return gic_read_count(); } static struct clocksource gic_clocksource = { diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S index ecb347ce1b3d..5c2ba9f08a80 100644 --- a/arch/mips/kernel/genex.S +++ b/arch/mips/kernel/genex.S @@ -5,8 +5,8 @@ * * Copyright (C) 1994 - 2000, 2001, 2003 Ralf Baechle * Copyright (C) 1999, 2000 Silicon Graphics, Inc. - * Copyright (C) 2001 MIPS Technologies, Inc. * Copyright (C) 2002, 2007 Maciej W. Rozycki + * Copyright (C) 2001, 2012 MIPS Technologies, Inc. All rights reserved. */ #include @@ -21,8 +21,10 @@ #include #include +#ifdef CONFIG_MIPS_MT_SMTC #define PANIC_PIC(msg) \ - .set push; \ + .set push; \ + .set nomicromips; \ .set reorder; \ PTR_LA a0,8f; \ .set noat; \ @@ -31,17 +33,10 @@ 9: b 9b; \ .set pop; \ TEXT(msg) +#endif __INIT -NESTED(except_vec0_generic, 0, sp) - PANIC_PIC("Exception vector 0 called") - END(except_vec0_generic) - -NESTED(except_vec1_generic, 0, sp) - PANIC_PIC("Exception vector 1 called") - END(except_vec1_generic) - /* * General exception vector for all other CPUs. * @@ -138,12 +133,19 @@ LEAF(r4k_wait) nop nop nop +#ifdef CONFIG_CPU_MICROMIPS + nop + nop + nop + nop +#endif .set mips3 wait /* end of rollback region (the region size must be power of two) */ - .set pop 1: jr ra + nop + .set pop END(r4k_wait) .macro BUILD_ROLLBACK_PROLOGUE handler @@ -201,7 +203,11 @@ NESTED(handle_int, PT_SIZE, sp) LONG_L s0, TI_REGS($28) LONG_S sp, TI_REGS($28) PTR_LA ra, ret_from_irq - j plat_irq_dispatch + PTR_LA v0, plat_irq_dispatch + jr v0 +#ifdef CONFIG_CPU_MICROMIPS + nop +#endif END(handle_int) __INIT @@ -222,11 +228,14 @@ NESTED(except_vec4, 0, sp) /* * EJTAG debug exception handler. * The EJTAG debug exception entry point is 0xbfc00480, which - * normally is in the boot PROM, so the boot PROM must do a + * normally is in the boot PROM, so the boot PROM must do an * unconditional jump to this vector. */ NESTED(except_vec_ejtag_debug, 0, sp) j ejtag_debug_handler +#ifdef CONFIG_CPU_MICROMIPS + nop +#endif END(except_vec_ejtag_debug) __FINIT @@ -251,9 +260,10 @@ NESTED(except_vec_vi, 0, sp) FEXPORT(except_vec_vi_mori) ori a0, $0, 0 #endif /* CONFIG_MIPS_MT_SMTC */ + PTR_LA v1, except_vec_vi_handler FEXPORT(except_vec_vi_lui) lui v0, 0 /* Patched */ - j except_vec_vi_handler + jr v1 FEXPORT(except_vec_vi_ori) ori v0, 0 /* Patched */ .set pop @@ -354,6 +364,9 @@ EXPORT(ejtag_debug_buffer) */ NESTED(except_vec_nmi, 0, sp) j nmi_handler +#ifdef CONFIG_CPU_MICROMIPS + nop +#endif END(except_vec_nmi) __FINIT @@ -480,7 +493,7 @@ NESTED(nmi_handler, PT_SIZE, sp) .set noreorder /* check if TLB contains a entry for EPC */ MFC0 k1, CP0_ENTRYHI - andi k1, 0xff /* ASID_MASK */ + andi k1, 0xff /* ASID_MASK patched at run-time!! */ MFC0 k0, CP0_EPC PTR_SRL k0, _PAGE_SHIFT + 1 PTR_SLL k0, _PAGE_SHIFT + 1 @@ -500,13 +513,35 @@ NESTED(nmi_handler, PT_SIZE, sp) .set push .set noat .set noreorder - /* 0x7c03e83b: rdhwr v1,$29 */ + /* MIPS32: 0x7c03e83b: rdhwr v1,$29 */ + /* microMIPS: 0x007d6b3c: rdhwr v1,$29 */ MFC0 k1, CP0_EPC - lui k0, 0x7c03 - lw k1, (k1) - ori k0, 0xe83b - .set reorder +#if defined(CONFIG_CPU_MICROMIPS) || defined(CONFIG_CPU_MIPS32_R2) || defined(CONFIG_CPU_MIPS64_R2) + and k0, k1, 1 + beqz k0, 1f + xor k1, k0 + lhu k0, (k1) + lhu k1, 2(k1) + ins k1, k0, 16, 16 + lui k0, 0x007d + b docheck + ori k0, 0x6b3c +1: + lui k0, 0x7c03 + lw k1, (k1) + ori k0, 0xe83b +#else + andi k0, k1, 1 + bnez k0, handle_ri + lui k0, 0x7c03 + lw k1, (k1) + ori k0, 0xe83b +#endif + .set reorder +docheck: bne k0, k1, handle_ri /* if not ours */ + +isrdhwr: /* The insn is rdhwr. No need to check CAUSE.BD here. */ get_saved_sp /* k1 := current_thread_info */ .set noreorder diff --git a/arch/mips/kernel/irq-gic.c b/arch/mips/kernel/irq-gic.c index 485e6a961b31..c01b307317a9 100644 --- a/arch/mips/kernel/irq-gic.c +++ b/arch/mips/kernel/irq-gic.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include @@ -19,6 +20,8 @@ #include #include +unsigned int gic_frequency; +unsigned int gic_present; unsigned long _gic_base; unsigned int gic_irq_base; unsigned int gic_irq_flags[GIC_NUM_INTRS]; @@ -30,6 +33,39 @@ static struct gic_pcpu_mask pcpu_masks[NR_CPUS]; static struct gic_pending_regs pending_regs[NR_CPUS]; static struct gic_intrmask_regs intrmask_regs[NR_CPUS]; +#if defined(CONFIG_CSRC_GIC) || defined(CONFIG_CEVT_GIC) +cycle_t gic_read_count(void) +{ + unsigned int hi, hi2, lo; + + do { + GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_63_32), hi); + GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_31_00), lo); + GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_63_32), hi2); + } while (hi2 != hi); + + return (((cycle_t) hi) << 32) + lo; +} + +void gic_write_compare(cycle_t cnt) +{ + GICWRITE(GIC_REG(VPE_LOCAL, GIC_VPE_COMPARE_HI), + (int)(cnt >> 32)); + GICWRITE(GIC_REG(VPE_LOCAL, GIC_VPE_COMPARE_LO), + (int)(cnt & 0xffffffff)); +} + +cycle_t gic_read_compare(void) +{ + unsigned int hi, lo; + + GICREAD(GIC_REG(VPE_LOCAL, GIC_VPE_COMPARE_HI), hi); + GICREAD(GIC_REG(VPE_LOCAL, GIC_VPE_COMPARE_LO), lo); + + return (((cycle_t) hi) << 32) + lo; +} +#endif + unsigned int gic_get_timer_pending(void) { unsigned int vpe_pending; @@ -116,6 +152,17 @@ static void __init vpe_local_setup(unsigned int numvpes) } } +unsigned int gic_compare_int(void) +{ + unsigned int pending; + + GICREAD(GIC_REG(VPE_LOCAL, GIC_VPE_PEND), pending); + if (pending & GIC_VPE_PEND_CMP_MSK) + return 1; + else + return 0; +} + unsigned int gic_get_int(void) { unsigned int i; diff --git a/arch/mips/kernel/mips_machine.c b/arch/mips/kernel/mips_machine.c index 411a058d2c53..876097529697 100644 --- a/arch/mips/kernel/mips_machine.c +++ b/arch/mips/kernel/mips_machine.c @@ -11,9 +11,9 @@ #include #include +#include static struct mips_machine *mips_machine __initdata; -static char *mips_machine_name = "Unknown"; #define for_each_machine(mach) \ for ((mach) = (struct mips_machine *)&__mips_machines_start; \ @@ -21,25 +21,6 @@ static char *mips_machine_name = "Unknown"; (unsigned long)(mach) < (unsigned long)&__mips_machines_end; \ (mach)++) -__init void mips_set_machine_name(const char *name) -{ - char *p; - - if (name == NULL) - return; - - p = kstrdup(name, GFP_KERNEL); - if (!p) - pr_err("MIPS: no memory for machine_name\n"); - - mips_machine_name = p; -} - -char *mips_get_machine_name(void) -{ - return mips_machine_name; -} - __init int mips_machtype_setup(char *id) { struct mips_machine *mach; @@ -79,7 +60,6 @@ __init void mips_machine_setup(void) return; mips_set_machine_name(mips_machine->mach_name); - pr_info("MIPS: machine is %s\n", mips_machine_name); if (mips_machine->mach_setup) mips_machine->mach_setup(); diff --git a/arch/mips/kernel/proc.c b/arch/mips/kernel/proc.c index 7a54f74b7818..a3e461408b7e 100644 --- a/arch/mips/kernel/proc.c +++ b/arch/mips/kernel/proc.c @@ -12,7 +12,7 @@ #include #include #include -#include +#include unsigned int vced_count, vcei_count; @@ -99,6 +99,10 @@ static int show_cpuinfo(struct seq_file *m, void *v) if (cpu_has_vz) seq_printf(m, "%s", " vz"); seq_printf(m, "\n"); + if (cpu_has_mmips) { + seq_printf(m, "micromips kernel\t: %s\n", + (read_c0_config3() & MIPS_CONF3_ISA_OE) ? "yes" : "no"); + } seq_printf(m, "shadow register sets\t: %d\n", cpu_data[n].srsets); seq_printf(m, "kscratch registers\t: %d\n", diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c index cfc742d75b7f..eb902c1f0cad 100644 --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -7,6 +7,7 @@ * Copyright (C) 2005, 2006 by Ralf Baechle (ralf@linux-mips.org) * Copyright (C) 1999, 2000 Silicon Graphics, Inc. * Copyright (C) 2004 Thiemo Seufer + * Copyright (C) 2013 Imagination Technologies Ltd. */ #include #include @@ -225,34 +226,115 @@ struct mips_frame_info { static inline int is_ra_save_ins(union mips_instruction *ip) { +#ifdef CONFIG_CPU_MICROMIPS + union mips_instruction mmi; + + /* + * swsp ra,offset + * swm16 reglist,offset(sp) + * swm32 reglist,offset(sp) + * sw32 ra,offset(sp) + * jradiussp - NOT SUPPORTED + * + * microMIPS is way more fun... + */ + if (mm_insn_16bit(ip->halfword[0])) { + mmi.word = (ip->halfword[0] << 16); + return ((mmi.mm16_r5_format.opcode == mm_swsp16_op && + mmi.mm16_r5_format.rt == 31) || + (mmi.mm16_m_format.opcode == mm_pool16c_op && + mmi.mm16_m_format.func == mm_swm16_op)); + } + else { + mmi.halfword[0] = ip->halfword[1]; + mmi.halfword[1] = ip->halfword[0]; + return ((mmi.mm_m_format.opcode == mm_pool32b_op && + mmi.mm_m_format.rd > 9 && + mmi.mm_m_format.base == 29 && + mmi.mm_m_format.func == mm_swm32_func) || + (mmi.i_format.opcode == mm_sw32_op && + mmi.i_format.rs == 29 && + mmi.i_format.rt == 31)); + } +#else /* sw / sd $ra, offset($sp) */ return (ip->i_format.opcode == sw_op || ip->i_format.opcode == sd_op) && ip->i_format.rs == 29 && ip->i_format.rt == 31; +#endif } static inline int is_jal_jalr_jr_ins(union mips_instruction *ip) { +#ifdef CONFIG_CPU_MICROMIPS + /* + * jr16,jrc,jalr16,jalr16 + * jal + * jalr/jr,jalr.hb/jr.hb,jalrs,jalrs.hb + * jraddiusp - NOT SUPPORTED + * + * microMIPS is kind of more fun... + */ + union mips_instruction mmi; + + mmi.word = (ip->halfword[0] << 16); + + if ((mmi.mm16_r5_format.opcode == mm_pool16c_op && + (mmi.mm16_r5_format.rt & mm_jr16_op) == mm_jr16_op) || + ip->j_format.opcode == mm_jal32_op) + return 1; + if (ip->r_format.opcode != mm_pool32a_op || + ip->r_format.func != mm_pool32axf_op) + return 0; + return (((ip->u_format.uimmediate >> 6) & mm_jalr_op) == mm_jalr_op); +#else if (ip->j_format.opcode == jal_op) return 1; if (ip->r_format.opcode != spec_op) return 0; return ip->r_format.func == jalr_op || ip->r_format.func == jr_op; +#endif } static inline int is_sp_move_ins(union mips_instruction *ip) { +#ifdef CONFIG_CPU_MICROMIPS + /* + * addiusp -imm + * addius5 sp,-imm + * addiu32 sp,sp,-imm + * jradiussp - NOT SUPPORTED + * + * microMIPS is not more fun... + */ + if (mm_insn_16bit(ip->halfword[0])) { + union mips_instruction mmi; + + mmi.word = (ip->halfword[0] << 16); + return ((mmi.mm16_r3_format.opcode == mm_pool16d_op && + mmi.mm16_r3_format.simmediate && mm_addiusp_func) || + (mmi.mm16_r5_format.opcode == mm_pool16d_op && + mmi.mm16_r5_format.rt == 29)); + } + return (ip->mm_i_format.opcode == mm_addiu32_op && + ip->mm_i_format.rt == 29 && ip->mm_i_format.rs == 29); +#else /* addiu/daddiu sp,sp,-imm */ if (ip->i_format.rs != 29 || ip->i_format.rt != 29) return 0; if (ip->i_format.opcode == addiu_op || ip->i_format.opcode == daddiu_op) return 1; +#endif return 0; } static int get_frame_info(struct mips_frame_info *info) { +#ifdef CONFIG_CPU_MICROMIPS + union mips_instruction *ip = (void *) (((char *) info->func) - 1); +#else union mips_instruction *ip = info->func; +#endif unsigned max_insns = info->func_size / sizeof(union mips_instruction); unsigned i; @@ -272,7 +354,26 @@ static int get_frame_info(struct mips_frame_info *info) break; if (!info->frame_size) { if (is_sp_move_ins(ip)) + { +#ifdef CONFIG_CPU_MICROMIPS + if (mm_insn_16bit(ip->halfword[0])) + { + unsigned short tmp; + + if (ip->halfword[0] & mm_addiusp_func) + { + tmp = (((ip->halfword[0] >> 1) & 0x1ff) << 2); + info->frame_size = -(signed short)(tmp | ((tmp & 0x100) ? 0xfe00 : 0)); + } else { + tmp = (ip->halfword[0] >> 1); + info->frame_size = -(signed short)(tmp & 0xf); + } + ip = (void *) &ip->halfword[1]; + ip--; + } else +#endif info->frame_size = - ip->i_format.simmediate; + } continue; } if (info->pc_offset == -1 && is_ra_save_ins(ip)) { diff --git a/arch/mips/kernel/prom.c b/arch/mips/kernel/prom.c index 028f6f837ef9..5712bb532245 100644 --- a/arch/mips/kernel/prom.c +++ b/arch/mips/kernel/prom.c @@ -23,6 +23,23 @@ #include #include +static char mips_machine_name[64] = "Unknown"; + +__init void mips_set_machine_name(const char *name) +{ + if (name == NULL) + return; + + strncpy(mips_machine_name, name, sizeof(mips_machine_name)); + pr_info("MIPS: machine is %s\n", mips_get_machine_name()); +} + +char *mips_get_machine_name(void) +{ + return mips_machine_name; +} + +#ifdef CONFIG_OF int __init early_init_dt_scan_memory_arch(unsigned long node, const char *uname, int depth, void *data) @@ -50,6 +67,18 @@ void __init early_init_dt_setup_initrd_arch(unsigned long start, } #endif +int __init early_init_dt_scan_model(unsigned long node, const char *uname, + int depth, void *data) +{ + if (!depth) { + char *model = of_get_flat_dt_prop(node, "model", NULL); + + if (model) + mips_set_machine_name(model); + } + return 0; +} + void __init early_init_devtree(void *params) { /* Setup flat device-tree pointer */ @@ -65,6 +94,9 @@ void __init early_init_devtree(void *params) /* Scan memory nodes */ of_scan_flat_dt(early_init_dt_scan_root, NULL); of_scan_flat_dt(early_init_dt_scan_memory_arch, NULL); + + /* try to load the mips machine name */ + of_scan_flat_dt(early_init_dt_scan_model, NULL); } void __init __dt_setup_arch(struct boot_param_header *bph) @@ -79,3 +111,4 @@ void __init __dt_setup_arch(struct boot_param_header *bph) early_init_devtree(initial_boot_params); } +#endif diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S index 9ea29649fc28..9b36424b03c5 100644 --- a/arch/mips/kernel/scall32-o32.S +++ b/arch/mips/kernel/scall32-o32.S @@ -138,9 +138,18 @@ stackargs: 5: jr t1 sw t5, 16(sp) # argument #5 to ksp +#ifdef CONFIG_CPU_MICROMIPS sw t8, 28(sp) # argument #8 to ksp + nop sw t7, 24(sp) # argument #7 to ksp + nop sw t6, 20(sp) # argument #6 to ksp + nop +#else + sw t8, 28(sp) # argument #8 to ksp + sw t7, 24(sp) # argument #7 to ksp + sw t6, 20(sp) # argument #6 to ksp +#endif 6: j stack_done # go back nop .set pop diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index 4c774d5d5087..c7f90519e58c 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include @@ -77,6 +78,8 @@ EXPORT_SYMBOL(mips_io_port_base); static struct resource code_resource = { .name = "Kernel code", }; static struct resource data_resource = { .name = "Kernel data", }; +static void *detect_magic __initdata = detect_memory_region; + void __init add_memory_region(phys_t start, phys_t size, long type) { int x = boot_mem_map.nr_map; @@ -122,6 +125,25 @@ void __init add_memory_region(phys_t start, phys_t size, long type) boot_mem_map.nr_map++; } +void __init detect_memory_region(phys_t start, phys_t sz_min, phys_t sz_max) +{ + void *dm = &detect_magic; + phys_t size; + + for (size = sz_min; size < sz_max; size <<= 1) { + if (!memcmp(dm, dm + size, sizeof(detect_magic))) + break; + } + + pr_debug("Memory: %lluMB of RAM detected at 0x%llx (min: %lluMB, max: %lluMB)\n", + ((unsigned long long) size) / SZ_1M, + (unsigned long long) start, + ((unsigned long long) sz_min) / SZ_1M, + ((unsigned long long) sz_max) / SZ_1M); + + add_memory_region(start, size, BOOT_MEM_RAM); +} + static void __init print_memory_map(void) { int i; diff --git a/arch/mips/kernel/signal.c b/arch/mips/kernel/signal.c index b5e88fd83277..fd3ef2c2afbc 100644 --- a/arch/mips/kernel/signal.c +++ b/arch/mips/kernel/signal.c @@ -35,6 +35,7 @@ #include #include #include +#include #include "signal-common.h" @@ -480,7 +481,15 @@ static void handle_signal(unsigned long sig, siginfo_t *info, sigset_t *oldset = sigmask_to_save(); int ret; struct mips_abi *abi = current->thread.abi; +#ifdef CONFIG_CPU_MICROMIPS + void *vdso; + unsigned int tmp = (unsigned int)current->mm->context.vdso; + + set_isa16_mode(tmp); + vdso = (void *)tmp; +#else void *vdso = current->mm->context.vdso; +#endif if (regs->regs[0]) { switch(regs->regs[2]) { diff --git a/arch/mips/kernel/smp-mt.c b/arch/mips/kernel/smp-mt.c index bfede063d96a..3e5164c11cac 100644 --- a/arch/mips/kernel/smp-mt.c +++ b/arch/mips/kernel/smp-mt.c @@ -34,6 +34,7 @@ #include #include #include +#include static void __init smvp_copy_vpe_config(void) { @@ -151,8 +152,6 @@ static void vsmp_send_ipi_mask(const struct cpumask *mask, unsigned int action) static void __cpuinit vsmp_init_secondary(void) { #ifdef CONFIG_IRQ_GIC - extern int gic_present; - /* This is Malta specific: IPI,performance and timer interrupts */ if (gic_present) change_c0_status(ST0_IM, STATUSF_IP3 | STATUSF_IP4 | diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c index aee04af213c5..c17619fe18e3 100644 --- a/arch/mips/kernel/smp.c +++ b/arch/mips/kernel/smp.c @@ -83,6 +83,7 @@ static inline void set_cpu_sibling_map(int cpu) } struct plat_smp_ops *mp_ops; +EXPORT_SYMBOL(mp_ops); __cpuinit void register_smp_ops(struct plat_smp_ops *ops) { diff --git a/arch/mips/kernel/smtc-asm.S b/arch/mips/kernel/smtc-asm.S index 76016ac0a9c8..2866863a39df 100644 --- a/arch/mips/kernel/smtc-asm.S +++ b/arch/mips/kernel/smtc-asm.S @@ -49,6 +49,9 @@ CAN WE PROVE THAT WE WON'T DO THIS IF INTS DISABLED?? .text .align 5 FEXPORT(__smtc_ipi_vector) +#ifdef CONFIG_CPU_MICROMIPS + nop +#endif .set noat /* Disable thread scheduling to make Status update atomic */ DMT 27 # dmt k1 diff --git a/arch/mips/kernel/smtc.c b/arch/mips/kernel/smtc.c index 7186222dc5bb..31d22f3121c9 100644 --- a/arch/mips/kernel/smtc.c +++ b/arch/mips/kernel/smtc.c @@ -111,7 +111,7 @@ static int vpe0limit; static int ipibuffers; static int nostlb; static int asidmask; -unsigned long smtc_asid_mask = 0xff; +unsigned int smtc_asid_mask = 0xff; static int __init vpe0tcs(char *str) { @@ -1395,7 +1395,7 @@ void smtc_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu) asid = asid_cache(cpu); do { - if (!((asid += ASID_INC) & ASID_MASK) ) { + if (!ASID_MASK(ASID_INC(asid))) { if (cpu_has_vtag_icache) flush_icache_all(); /* Traverse all online CPUs (hack requires contiguous range) */ @@ -1414,7 +1414,7 @@ void smtc_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu) mips_ihb(); } tcstat = read_tc_c0_tcstatus(); - smtc_live_asid[tlb][(tcstat & ASID_MASK)] |= (asiduse)(0x1 << i); + smtc_live_asid[tlb][ASID_MASK(tcstat)] |= (asiduse)(0x1 << i); if (!prevhalt) write_tc_c0_tchalt(0); } @@ -1423,7 +1423,7 @@ void smtc_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu) asid = ASID_FIRST_VERSION; local_flush_tlb_all(); /* start new asid cycle */ } - } while (smtc_live_asid[tlb][(asid & ASID_MASK)]); + } while (smtc_live_asid[tlb][ASID_MASK(asid)]); /* * SMTC shares the TLB within VPEs and possibly across all VPEs. @@ -1461,7 +1461,7 @@ void smtc_flush_tlb_asid(unsigned long asid) tlb_read(); ehb(); ehi = read_c0_entryhi(); - if ((ehi & ASID_MASK) == asid) { + if (ASID_MASK(ehi) == asid) { /* * Invalidate only entries with specified ASID, * makiing sure all entries differ. diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c index 25225515451f..77cff1f6d050 100644 --- a/arch/mips/kernel/traps.c +++ b/arch/mips/kernel/traps.c @@ -8,8 +8,8 @@ * Copyright (C) 1998 Ulf Carlsson * Copyright (C) 1999 Silicon Graphics, Inc. * Kevin D. Kissell, kevink@mips.com and Carsten Langgaard, carstenl@mips.com - * Copyright (C) 2000, 01 MIPS Technologies, Inc. * Copyright (C) 2002, 2003, 2004, 2005, 2007 Maciej W. Rozycki + * Copyright (C) 2000, 2001, 2012 MIPS Technologies, Inc. All rights reserved. */ #include #include @@ -60,9 +60,9 @@ extern void check_wait(void); extern asmlinkage void r4k_wait(void); extern asmlinkage void rollback_handle_int(void); extern asmlinkage void handle_int(void); -extern asmlinkage void handle_tlbm(void); -extern asmlinkage void handle_tlbl(void); -extern asmlinkage void handle_tlbs(void); +extern u32 handle_tlbl[]; +extern u32 handle_tlbs[]; +extern u32 handle_tlbm[]; extern asmlinkage void handle_adel(void); extern asmlinkage void handle_ades(void); extern asmlinkage void handle_ibe(void); @@ -83,10 +83,6 @@ extern asmlinkage void handle_dsp(void); extern asmlinkage void handle_mcheck(void); extern asmlinkage void handle_reserved(void); -extern int fpu_emulator_cop1Handler(struct pt_regs *xcp, - struct mips_fpu_struct *ctx, int has_fpu, - void *__user *fault_addr); - void (*board_be_init)(void); int (*board_be_handler)(struct pt_regs *regs, int is_fixup); void (*board_nmi_handler_setup)(void); @@ -482,6 +478,12 @@ asmlinkage void do_be(struct pt_regs *regs) #define SYNC 0x0000000f #define RDHWR 0x0000003b +/* microMIPS definitions */ +#define MM_POOL32A_FUNC 0xfc00ffff +#define MM_RDHWR 0x00006b3c +#define MM_RS 0x001f0000 +#define MM_RT 0x03e00000 + /* * The ll_bit is cleared by r*_switch.S */ @@ -596,42 +598,62 @@ static int simulate_llsc(struct pt_regs *regs, unsigned int opcode) * Simulate trapping 'rdhwr' instructions to provide user accessible * registers not implemented in hardware. */ -static int simulate_rdhwr(struct pt_regs *regs, unsigned int opcode) +static int simulate_rdhwr(struct pt_regs *regs, int rd, int rt) { struct thread_info *ti = task_thread_info(current); + perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, + 1, regs, 0); + switch (rd) { + case 0: /* CPU number */ + regs->regs[rt] = smp_processor_id(); + return 0; + case 1: /* SYNCI length */ + regs->regs[rt] = min(current_cpu_data.dcache.linesz, + current_cpu_data.icache.linesz); + return 0; + case 2: /* Read count register */ + regs->regs[rt] = read_c0_count(); + return 0; + case 3: /* Count register resolution */ + switch (current_cpu_data.cputype) { + case CPU_20KC: + case CPU_25KF: + regs->regs[rt] = 1; + break; + default: + regs->regs[rt] = 2; + } + return 0; + case 29: + regs->regs[rt] = ti->tp_value; + return 0; + default: + return -1; + } +} + +static int simulate_rdhwr_normal(struct pt_regs *regs, unsigned int opcode) +{ if ((opcode & OPCODE) == SPEC3 && (opcode & FUNC) == RDHWR) { int rd = (opcode & RD) >> 11; int rt = (opcode & RT) >> 16; - perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, - 1, regs, 0); - switch (rd) { - case 0: /* CPU number */ - regs->regs[rt] = smp_processor_id(); - return 0; - case 1: /* SYNCI length */ - regs->regs[rt] = min(current_cpu_data.dcache.linesz, - current_cpu_data.icache.linesz); - return 0; - case 2: /* Read count register */ - regs->regs[rt] = read_c0_count(); - return 0; - case 3: /* Count register resolution */ - switch (current_cpu_data.cputype) { - case CPU_20KC: - case CPU_25KF: - regs->regs[rt] = 1; - break; - default: - regs->regs[rt] = 2; - } - return 0; - case 29: - regs->regs[rt] = ti->tp_value; - return 0; - default: - return -1; - } + + simulate_rdhwr(regs, rd, rt); + return 0; + } + + /* Not ours. */ + return -1; +} + +static int simulate_rdhwr_mm(struct pt_regs *regs, unsigned short opcode) +{ + if ((opcode & MM_POOL32A_FUNC) == MM_RDHWR) { + int rd = (opcode & MM_RS) >> 16; + int rt = (opcode & MM_RT) >> 21; + simulate_rdhwr(regs, rd, rt); + return 0; } /* Not ours. */ @@ -662,7 +684,7 @@ asmlinkage void do_ov(struct pt_regs *regs) force_sig_info(SIGFPE, &info, current); } -static int process_fpemu_return(int sig, void __user *fault_addr) +int process_fpemu_return(int sig, void __user *fault_addr) { if (sig == SIGSEGV || sig == SIGBUS) { struct siginfo si = {0}; @@ -813,9 +835,29 @@ static void do_trap_or_bp(struct pt_regs *regs, unsigned int code, asmlinkage void do_bp(struct pt_regs *regs) { unsigned int opcode, bcode; - - if (__get_user(opcode, (unsigned int __user *) exception_epc(regs))) - goto out_sigsegv; + unsigned long epc; + u16 instr[2]; + + if (get_isa16_mode(regs->cp0_epc)) { + /* Calculate EPC. */ + epc = exception_epc(regs); + if (cpu_has_mmips) { + if ((__get_user(instr[0], (u16 __user *)msk_isa16_mode(epc)) || + (__get_user(instr[1], (u16 __user *)msk_isa16_mode(epc + 2))))) + goto out_sigsegv; + opcode = (instr[0] << 16) | instr[1]; + } else { + /* MIPS16e mode */ + if (__get_user(instr[0], (u16 __user *)msk_isa16_mode(epc))) + goto out_sigsegv; + bcode = (instr[0] >> 6) & 0x3f; + do_trap_or_bp(regs, bcode, "Break"); + return; + } + } else { + if (__get_user(opcode, (unsigned int __user *) exception_epc(regs))) + goto out_sigsegv; + } /* * There is the ancient bug in the MIPS assemblers that the break @@ -856,13 +898,22 @@ out_sigsegv: asmlinkage void do_tr(struct pt_regs *regs) { unsigned int opcode, tcode = 0; + u16 instr[2]; + unsigned long epc = exception_epc(regs); - if (__get_user(opcode, (unsigned int __user *) exception_epc(regs))) - goto out_sigsegv; + if ((__get_user(instr[0], (u16 __user *)msk_isa16_mode(epc))) || + (__get_user(instr[1], (u16 __user *)msk_isa16_mode(epc + 2)))) + goto out_sigsegv; + opcode = (instr[0] << 16) | instr[1]; /* Immediate versions don't provide a code. */ - if (!(opcode & OPCODE)) - tcode = ((opcode >> 6) & ((1 << 10) - 1)); + if (!(opcode & OPCODE)) { + if (get_isa16_mode(regs->cp0_epc)) + /* microMIPS */ + tcode = (opcode >> 12) & 0x1f; + else + tcode = ((opcode >> 6) & ((1 << 10) - 1)); + } do_trap_or_bp(regs, tcode, "Trap"); return; @@ -875,6 +926,7 @@ asmlinkage void do_ri(struct pt_regs *regs) { unsigned int __user *epc = (unsigned int __user *)exception_epc(regs); unsigned long old_epc = regs->cp0_epc; + unsigned long old31 = regs->regs[31]; unsigned int opcode = 0; int status = -1; @@ -887,23 +939,37 @@ asmlinkage void do_ri(struct pt_regs *regs) if (unlikely(compute_return_epc(regs) < 0)) return; - if (unlikely(get_user(opcode, epc) < 0)) - status = SIGSEGV; + if (get_isa16_mode(regs->cp0_epc)) { + unsigned short mmop[2] = { 0 }; - if (!cpu_has_llsc && status < 0) - status = simulate_llsc(regs, opcode); + if (unlikely(get_user(mmop[0], epc) < 0)) + status = SIGSEGV; + if (unlikely(get_user(mmop[1], epc) < 0)) + status = SIGSEGV; + opcode = (mmop[0] << 16) | mmop[1]; - if (status < 0) - status = simulate_rdhwr(regs, opcode); + if (status < 0) + status = simulate_rdhwr_mm(regs, opcode); + } else { + if (unlikely(get_user(opcode, epc) < 0)) + status = SIGSEGV; - if (status < 0) - status = simulate_sync(regs, opcode); + if (!cpu_has_llsc && status < 0) + status = simulate_llsc(regs, opcode); + + if (status < 0) + status = simulate_rdhwr_normal(regs, opcode); + + if (status < 0) + status = simulate_sync(regs, opcode); + } if (status < 0) status = SIGILL; if (unlikely(status > 0)) { regs->cp0_epc = old_epc; /* Undo skip-over. */ + regs->regs[31] = old31; force_sig(status, current); } } @@ -973,7 +1039,7 @@ static int default_cu2_call(struct notifier_block *nfb, unsigned long action, asmlinkage void do_cpu(struct pt_regs *regs) { unsigned int __user *epc; - unsigned long old_epc; + unsigned long old_epc, old31; unsigned int opcode; unsigned int cpid; int status; @@ -987,26 +1053,41 @@ asmlinkage void do_cpu(struct pt_regs *regs) case 0: epc = (unsigned int __user *)exception_epc(regs); old_epc = regs->cp0_epc; + old31 = regs->regs[31]; opcode = 0; status = -1; if (unlikely(compute_return_epc(regs) < 0)) return; - if (unlikely(get_user(opcode, epc) < 0)) - status = SIGSEGV; + if (get_isa16_mode(regs->cp0_epc)) { + unsigned short mmop[2] = { 0 }; - if (!cpu_has_llsc && status < 0) - status = simulate_llsc(regs, opcode); + if (unlikely(get_user(mmop[0], epc) < 0)) + status = SIGSEGV; + if (unlikely(get_user(mmop[1], epc) < 0)) + status = SIGSEGV; + opcode = (mmop[0] << 16) | mmop[1]; - if (status < 0) - status = simulate_rdhwr(regs, opcode); + if (status < 0) + status = simulate_rdhwr_mm(regs, opcode); + } else { + if (unlikely(get_user(opcode, epc) < 0)) + status = SIGSEGV; + + if (!cpu_has_llsc && status < 0) + status = simulate_llsc(regs, opcode); + + if (status < 0) + status = simulate_rdhwr_normal(regs, opcode); + } if (status < 0) status = SIGILL; if (unlikely(status > 0)) { regs->cp0_epc = old_epc; /* Undo skip-over. */ + regs->regs[31] = old31; force_sig(status, current); } @@ -1320,7 +1401,7 @@ asmlinkage void cache_parity_error(void) void ejtag_exception_handler(struct pt_regs *regs) { const int field = 2 * sizeof(unsigned long); - unsigned long depc, old_epc; + unsigned long depc, old_epc, old_ra; unsigned int debug; printk(KERN_DEBUG "SDBBP EJTAG debug exception - not handled yet, just ignored!\n"); @@ -1335,10 +1416,12 @@ void ejtag_exception_handler(struct pt_regs *regs) * calculation. */ old_epc = regs->cp0_epc; + old_ra = regs->regs[31]; regs->cp0_epc = depc; - __compute_return_epc(regs); + compute_return_epc(regs); depc = regs->cp0_epc; regs->cp0_epc = old_epc; + regs->regs[31] = old_ra; } else depc += 4; write_c0_depc(depc); @@ -1377,11 +1460,27 @@ unsigned long vi_handlers[64]; void __init *set_except_vector(int n, void *addr) { unsigned long handler = (unsigned long) addr; - unsigned long old_handler = exception_handlers[n]; + unsigned long old_handler; + +#ifdef CONFIG_CPU_MICROMIPS + /* + * Only the TLB handlers are cache aligned with an even + * address. All other handlers are on an odd address and + * require no modification. Otherwise, MIPS32 mode will + * be entered when handling any TLB exceptions. That + * would be bad...since we must stay in microMIPS mode. + */ + if (!(handler & 0x1)) + handler |= 1; +#endif + old_handler = xchg(&exception_handlers[n], handler); - exception_handlers[n] = handler; if (n == 0 && cpu_has_divec) { +#ifdef CONFIG_CPU_MICROMIPS + unsigned long jump_mask = ~((1 << 27) - 1); +#else unsigned long jump_mask = ~((1 << 28) - 1); +#endif u32 *buf = (u32 *)(ebase + 0x200); unsigned int k0 = 26; if ((handler & jump_mask) == ((ebase + 0x200) & jump_mask)) { @@ -1397,7 +1496,7 @@ void __init *set_except_vector(int n, void *addr) return (void *)old_handler; } -static asmlinkage void do_default_vi(void) +static void do_default_vi(void) { show_regs(get_irq_regs()); panic("Caught unexpected vectored interrupt."); @@ -1408,17 +1507,18 @@ static void *set_vi_srs_handler(int n, vi_handler_t addr, int srs) unsigned long handler; unsigned long old_handler = vi_handlers[n]; int srssets = current_cpu_data.srsets; - u32 *w; + u16 *h; unsigned char *b; BUG_ON(!cpu_has_veic && !cpu_has_vint); + BUG_ON((n < 0) && (n > 9)); if (addr == NULL) { handler = (unsigned long) do_default_vi; srs = 0; } else handler = (unsigned long) addr; - vi_handlers[n] = (unsigned long) addr; + vi_handlers[n] = handler; b = (unsigned char *)(ebase + 0x200 + n*VECTORSPACING); @@ -1437,9 +1537,8 @@ static void *set_vi_srs_handler(int n, vi_handler_t addr, int srs) if (srs == 0) { /* * If no shadow set is selected then use the default handler - * that does normal register saving and a standard interrupt exit + * that does normal register saving and standard interrupt exit */ - extern char except_vec_vi, except_vec_vi_lui; extern char except_vec_vi_ori, except_vec_vi_end; extern char rollback_except_vec_vi; @@ -1452,11 +1551,20 @@ static void *set_vi_srs_handler(int n, vi_handler_t addr, int srs) * Status.IM bit to be masked before going there. */ extern char except_vec_vi_mori; +#if defined(CONFIG_CPU_MICROMIPS) || defined(CONFIG_CPU_BIG_ENDIAN) + const int mori_offset = &except_vec_vi_mori - vec_start + 2; +#else const int mori_offset = &except_vec_vi_mori - vec_start; +#endif #endif /* CONFIG_MIPS_MT_SMTC */ - const int handler_len = &except_vec_vi_end - vec_start; +#if defined(CONFIG_CPU_MICROMIPS) || defined(CONFIG_CPU_BIG_ENDIAN) + const int lui_offset = &except_vec_vi_lui - vec_start + 2; + const int ori_offset = &except_vec_vi_ori - vec_start + 2; +#else const int lui_offset = &except_vec_vi_lui - vec_start; const int ori_offset = &except_vec_vi_ori - vec_start; +#endif + const int handler_len = &except_vec_vi_end - vec_start; if (handler_len > VECTORSPACING) { /* @@ -1466,30 +1574,44 @@ static void *set_vi_srs_handler(int n, vi_handler_t addr, int srs) panic("VECTORSPACING too small"); } - memcpy(b, vec_start, handler_len); + set_handler(((unsigned long)b - ebase), vec_start, +#ifdef CONFIG_CPU_MICROMIPS + (handler_len - 1)); +#else + handler_len); +#endif #ifdef CONFIG_MIPS_MT_SMTC BUG_ON(n > 7); /* Vector index %d exceeds SMTC maximum. */ - w = (u32 *)(b + mori_offset); - *w = (*w & 0xffff0000) | (0x100 << n); + h = (u16 *)(b + mori_offset); + *h = (0x100 << n); #endif /* CONFIG_MIPS_MT_SMTC */ - w = (u32 *)(b + lui_offset); - *w = (*w & 0xffff0000) | (((u32)handler >> 16) & 0xffff); - w = (u32 *)(b + ori_offset); - *w = (*w & 0xffff0000) | ((u32)handler & 0xffff); + h = (u16 *)(b + lui_offset); + *h = (handler >> 16) & 0xffff; + h = (u16 *)(b + ori_offset); + *h = (handler & 0xffff); local_flush_icache_range((unsigned long)b, (unsigned long)(b+handler_len)); } else { /* - * In other cases jump directly to the interrupt handler - * - * It is the handlers responsibility to save registers if required - * (eg hi/lo) and return from the exception using "eret" + * In other cases jump directly to the interrupt handler. It + * is the handler's responsibility to save registers if required + * (eg hi/lo) and return from the exception using "eret". */ - w = (u32 *)b; - *w++ = 0x08000000 | (((u32)handler >> 2) & 0x03fffff); /* j handler */ - *w = 0; + u32 insn; + + h = (u16 *)b; + /* j handler */ +#ifdef CONFIG_CPU_MICROMIPS + insn = 0xd4000000 | (((u32)handler & 0x07ffffff) >> 1); +#else + insn = 0x08000000 | (((u32)handler & 0x0fffffff) >> 2); +#endif + h[0] = (insn >> 16) & 0xffff; + h[1] = insn & 0xffff; + h[2] = 0; + h[3] = 0; local_flush_icache_range((unsigned long)b, (unsigned long)(b+8)); } @@ -1534,6 +1656,7 @@ void __cpuinit per_cpu_trap_init(bool is_boot_cpu) unsigned int cpu = smp_processor_id(); unsigned int status_set = ST0_CU0; unsigned int hwrena = cpu_hwrena_impl_bits; + unsigned long asid = 0; #ifdef CONFIG_MIPS_MT_SMTC int secondaryTC = 0; int bootTC = (cpu == 0); @@ -1617,8 +1740,9 @@ void __cpuinit per_cpu_trap_init(bool is_boot_cpu) } #endif /* CONFIG_MIPS_MT_SMTC */ - if (!cpu_data[cpu].asid_cache) - cpu_data[cpu].asid_cache = ASID_FIRST_VERSION; + asid = ASID_FIRST_VERSION; + cpu_data[cpu].asid_cache = asid; + TLBMISS_HANDLER_SETUP(); atomic_inc(&init_mm.mm_count); current->active_mm = &init_mm; @@ -1648,7 +1772,11 @@ void __cpuinit per_cpu_trap_init(bool is_boot_cpu) /* Install CPU exception handler */ void __cpuinit set_handler(unsigned long offset, void *addr, unsigned long size) { +#ifdef CONFIG_CPU_MICROMIPS + memcpy((void *)(ebase + offset), ((unsigned char *)addr - 1), size); +#else memcpy((void *)(ebase + offset), addr, size); +#endif local_flush_icache_range(ebase + offset, ebase + offset + size); } @@ -1682,8 +1810,9 @@ __setup("rdhwr_noopt", set_rdhwr_noopt); void __init trap_init(void) { - extern char except_vec3_generic, except_vec3_r4000; + extern char except_vec3_generic; extern char except_vec4; + extern char except_vec3_r4000; unsigned long i; int rollback; @@ -1700,7 +1829,12 @@ void __init trap_init(void) ebase = (unsigned long) __alloc_bootmem(size, 1 << fls(size), 0); } else { - ebase = CKSEG0; +#ifdef CONFIG_KVM_GUEST +#define KVM_GUEST_KSEG0 0x40000000 + ebase = KVM_GUEST_KSEG0; +#else + ebase = CKSEG0; +#endif if (cpu_has_mips_r2) ebase += (read_c0_ebase() & 0x3ffff000); } @@ -1816,11 +1950,11 @@ void __init trap_init(void) if (cpu_has_vce) /* Special exception: R4[04]00 uses also the divec space. */ - memcpy((void *)(ebase + 0x180), &except_vec3_r4000, 0x100); + set_handler(0x180, &except_vec3_r4000, 0x100); else if (cpu_has_4kex) - memcpy((void *)(ebase + 0x180), &except_vec3_generic, 0x80); + set_handler(0x180, &except_vec3_generic, 0x80); else - memcpy((void *)(ebase + 0x080), &except_vec3_generic, 0x80); + set_handler(0x080, &except_vec3_generic, 0x80); local_flush_icache_range(ebase, ebase + 0x400); flush_tlb_handlers(); diff --git a/arch/mips/kernel/unaligned.c b/arch/mips/kernel/unaligned.c index 6087a54c86a0..203d8857070d 100644 --- a/arch/mips/kernel/unaligned.c +++ b/arch/mips/kernel/unaligned.c @@ -83,8 +83,12 @@ #include #include #include +#include +#include #include #include +#include +#include #define STR(x) __STR(x) #define __STR(x) #x @@ -102,12 +106,332 @@ static u32 unaligned_action; #endif extern void show_registers(struct pt_regs *regs); +#ifdef __BIG_ENDIAN +#define LoadHW(addr, value, res) \ + __asm__ __volatile__ (".set\tnoat\n" \ + "1:\tlb\t%0, 0(%2)\n" \ + "2:\tlbu\t$1, 1(%2)\n\t" \ + "sll\t%0, 0x8\n\t" \ + "or\t%0, $1\n\t" \ + "li\t%1, 0\n" \ + "3:\t.set\tat\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tlwl\t%0, (%2)\n" \ + "2:\tlwr\t%0, 3(%2)\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadHWU(addr, value, res) \ + __asm__ __volatile__ ( \ + ".set\tnoat\n" \ + "1:\tlbu\t%0, 0(%2)\n" \ + "2:\tlbu\t$1, 1(%2)\n\t" \ + "sll\t%0, 0x8\n\t" \ + "or\t%0, $1\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".set\tat\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadWU(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tlwl\t%0, (%2)\n" \ + "2:\tlwr\t%0, 3(%2)\n\t" \ + "dsll\t%0, %0, 32\n\t" \ + "dsrl\t%0, %0, 32\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + "\t.section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadDW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tldl\t%0, (%2)\n" \ + "2:\tldr\t%0, 7(%2)\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + "\t.section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define StoreHW(addr, value, res) \ + __asm__ __volatile__ ( \ + ".set\tnoat\n" \ + "1:\tsb\t%1, 1(%2)\n\t" \ + "srl\t$1, %1, 0x8\n" \ + "2:\tsb\t$1, 0(%2)\n\t" \ + ".set\tat\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); + +#define StoreW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tswl\t%1,(%2)\n" \ + "2:\tswr\t%1, 3(%2)\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); + +#define StoreDW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tsdl\t%1,(%2)\n" \ + "2:\tsdr\t%1, 7(%2)\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); +#endif + +#ifdef __LITTLE_ENDIAN +#define LoadHW(addr, value, res) \ + __asm__ __volatile__ (".set\tnoat\n" \ + "1:\tlb\t%0, 1(%2)\n" \ + "2:\tlbu\t$1, 0(%2)\n\t" \ + "sll\t%0, 0x8\n\t" \ + "or\t%0, $1\n\t" \ + "li\t%1, 0\n" \ + "3:\t.set\tat\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tlwl\t%0, 3(%2)\n" \ + "2:\tlwr\t%0, (%2)\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadHWU(addr, value, res) \ + __asm__ __volatile__ ( \ + ".set\tnoat\n" \ + "1:\tlbu\t%0, 1(%2)\n" \ + "2:\tlbu\t$1, 0(%2)\n\t" \ + "sll\t%0, 0x8\n\t" \ + "or\t%0, $1\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".set\tat\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadWU(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tlwl\t%0, 3(%2)\n" \ + "2:\tlwr\t%0, (%2)\n\t" \ + "dsll\t%0, %0, 32\n\t" \ + "dsrl\t%0, %0, 32\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + "\t.section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define LoadDW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tldl\t%0, 7(%2)\n" \ + "2:\tldr\t%0, (%2)\n\t" \ + "li\t%1, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + "\t.section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%1, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=&r" (value), "=r" (res) \ + : "r" (addr), "i" (-EFAULT)); + +#define StoreHW(addr, value, res) \ + __asm__ __volatile__ ( \ + ".set\tnoat\n" \ + "1:\tsb\t%1, 0(%2)\n\t" \ + "srl\t$1,%1, 0x8\n" \ + "2:\tsb\t$1, 1(%2)\n\t" \ + ".set\tat\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); + +#define StoreW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tswl\t%1, 3(%2)\n" \ + "2:\tswr\t%1, (%2)\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); + +#define StoreDW(addr, value, res) \ + __asm__ __volatile__ ( \ + "1:\tsdl\t%1, 7(%2)\n" \ + "2:\tsdr\t%1, (%2)\n\t" \ + "li\t%0, 0\n" \ + "3:\n\t" \ + ".insn\n\t" \ + ".section\t.fixup,\"ax\"\n\t" \ + "4:\tli\t%0, %3\n\t" \ + "j\t3b\n\t" \ + ".previous\n\t" \ + ".section\t__ex_table,\"a\"\n\t" \ + STR(PTR)"\t1b, 4b\n\t" \ + STR(PTR)"\t2b, 4b\n\t" \ + ".previous" \ + : "=r" (res) \ + : "r" (value), "r" (addr), "i" (-EFAULT)); +#endif + static void emulate_load_store_insn(struct pt_regs *regs, void __user *addr, unsigned int __user *pc) { union mips_instruction insn; unsigned long value; unsigned int res; + unsigned long origpc; + unsigned long orig31; + void __user *fault_addr = NULL; + + origpc = (unsigned long)pc; + orig31 = regs->regs[31]; perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, 1, regs, 0); @@ -117,22 +441,22 @@ static void emulate_load_store_insn(struct pt_regs *regs, __get_user(insn.word, pc); switch (insn.i_format.opcode) { - /* - * These are instructions that a compiler doesn't generate. We - * can assume therefore that the code is MIPS-aware and - * really buggy. Emulating these instructions would break the - * semantics anyway. - */ + /* + * These are instructions that a compiler doesn't generate. We + * can assume therefore that the code is MIPS-aware and + * really buggy. Emulating these instructions would break the + * semantics anyway. + */ case ll_op: case lld_op: case sc_op: case scd_op: - /* - * For these instructions the only way to create an address - * error is an attempted access to kernel/supervisor address - * space. - */ + /* + * For these instructions the only way to create an address + * error is an attempted access to kernel/supervisor address + * space. + */ case ldl_op: case ldr_op: case lwl_op: @@ -146,36 +470,15 @@ static void emulate_load_store_insn(struct pt_regs *regs, case sb_op: goto sigbus; - /* - * The remaining opcodes are the ones that are really of interest. - */ + /* + * The remaining opcodes are the ones that are really of + * interest. + */ case lh_op: if (!access_ok(VERIFY_READ, addr, 2)) goto sigbus; - __asm__ __volatile__ (".set\tnoat\n" -#ifdef __BIG_ENDIAN - "1:\tlb\t%0, 0(%2)\n" - "2:\tlbu\t$1, 1(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tlb\t%0, 1(%2)\n" - "2:\tlbu\t$1, 0(%2)\n\t" -#endif - "sll\t%0, 0x8\n\t" - "or\t%0, $1\n\t" - "li\t%1, 0\n" - "3:\t.set\tat\n\t" - ".section\t.fixup,\"ax\"\n\t" - "4:\tli\t%1, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=&r" (value), "=r" (res) - : "r" (addr), "i" (-EFAULT)); + LoadHW(addr, value, res); if (res) goto fault; compute_return_epc(regs); @@ -186,26 +489,7 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_READ, addr, 4)) goto sigbus; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - "1:\tlwl\t%0, (%2)\n" - "2:\tlwr\t%0, 3(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tlwl\t%0, 3(%2)\n" - "2:\tlwr\t%0, (%2)\n\t" -#endif - "li\t%1, 0\n" - "3:\t.section\t.fixup,\"ax\"\n\t" - "4:\tli\t%1, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=&r" (value), "=r" (res) - : "r" (addr), "i" (-EFAULT)); + LoadW(addr, value, res); if (res) goto fault; compute_return_epc(regs); @@ -216,30 +500,7 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_READ, addr, 2)) goto sigbus; - __asm__ __volatile__ ( - ".set\tnoat\n" -#ifdef __BIG_ENDIAN - "1:\tlbu\t%0, 0(%2)\n" - "2:\tlbu\t$1, 1(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tlbu\t%0, 1(%2)\n" - "2:\tlbu\t$1, 0(%2)\n\t" -#endif - "sll\t%0, 0x8\n\t" - "or\t%0, $1\n\t" - "li\t%1, 0\n" - "3:\t.set\tat\n\t" - ".section\t.fixup,\"ax\"\n\t" - "4:\tli\t%1, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=&r" (value), "=r" (res) - : "r" (addr), "i" (-EFAULT)); + LoadHWU(addr, value, res); if (res) goto fault; compute_return_epc(regs); @@ -258,28 +519,7 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_READ, addr, 4)) goto sigbus; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - "1:\tlwl\t%0, (%2)\n" - "2:\tlwr\t%0, 3(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tlwl\t%0, 3(%2)\n" - "2:\tlwr\t%0, (%2)\n\t" -#endif - "dsll\t%0, %0, 32\n\t" - "dsrl\t%0, %0, 32\n\t" - "li\t%1, 0\n" - "3:\t.section\t.fixup,\"ax\"\n\t" - "4:\tli\t%1, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=&r" (value), "=r" (res) - : "r" (addr), "i" (-EFAULT)); + LoadWU(addr, value, res); if (res) goto fault; compute_return_epc(regs); @@ -302,26 +542,7 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_READ, addr, 8)) goto sigbus; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - "1:\tldl\t%0, (%2)\n" - "2:\tldr\t%0, 7(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tldl\t%0, 7(%2)\n" - "2:\tldr\t%0, (%2)\n\t" -#endif - "li\t%1, 0\n" - "3:\t.section\t.fixup,\"ax\"\n\t" - "4:\tli\t%1, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=&r" (value), "=r" (res) - : "r" (addr), "i" (-EFAULT)); + LoadDW(addr, value, res); if (res) goto fault; compute_return_epc(regs); @@ -336,68 +557,22 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_WRITE, addr, 2)) goto sigbus; + compute_return_epc(regs); value = regs->regs[insn.i_format.rt]; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - ".set\tnoat\n" - "1:\tsb\t%1, 1(%2)\n\t" - "srl\t$1, %1, 0x8\n" - "2:\tsb\t$1, 0(%2)\n\t" - ".set\tat\n\t" -#endif -#ifdef __LITTLE_ENDIAN - ".set\tnoat\n" - "1:\tsb\t%1, 0(%2)\n\t" - "srl\t$1,%1, 0x8\n" - "2:\tsb\t$1, 1(%2)\n\t" - ".set\tat\n\t" -#endif - "li\t%0, 0\n" - "3:\n\t" - ".section\t.fixup,\"ax\"\n\t" - "4:\tli\t%0, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=r" (res) - : "r" (value), "r" (addr), "i" (-EFAULT)); + StoreHW(addr, value, res); if (res) goto fault; - compute_return_epc(regs); break; case sw_op: if (!access_ok(VERIFY_WRITE, addr, 4)) goto sigbus; + compute_return_epc(regs); value = regs->regs[insn.i_format.rt]; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - "1:\tswl\t%1,(%2)\n" - "2:\tswr\t%1, 3(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tswl\t%1, 3(%2)\n" - "2:\tswr\t%1, (%2)\n\t" -#endif - "li\t%0, 0\n" - "3:\n\t" - ".section\t.fixup,\"ax\"\n\t" - "4:\tli\t%0, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=r" (res) - : "r" (value), "r" (addr), "i" (-EFAULT)); + StoreW(addr, value, res); if (res) goto fault; - compute_return_epc(regs); break; case sd_op: @@ -412,31 +587,11 @@ static void emulate_load_store_insn(struct pt_regs *regs, if (!access_ok(VERIFY_WRITE, addr, 8)) goto sigbus; + compute_return_epc(regs); value = regs->regs[insn.i_format.rt]; - __asm__ __volatile__ ( -#ifdef __BIG_ENDIAN - "1:\tsdl\t%1,(%2)\n" - "2:\tsdr\t%1, 7(%2)\n\t" -#endif -#ifdef __LITTLE_ENDIAN - "1:\tsdl\t%1, 7(%2)\n" - "2:\tsdr\t%1, (%2)\n\t" -#endif - "li\t%0, 0\n" - "3:\n\t" - ".section\t.fixup,\"ax\"\n\t" - "4:\tli\t%0, %3\n\t" - "j\t3b\n\t" - ".previous\n\t" - ".section\t__ex_table,\"a\"\n\t" - STR(PTR)"\t1b, 4b\n\t" - STR(PTR)"\t2b, 4b\n\t" - ".previous" - : "=r" (res) - : "r" (value), "r" (addr), "i" (-EFAULT)); + StoreDW(addr, value, res); if (res) goto fault; - compute_return_epc(regs); break; #endif /* CONFIG_64BIT */ @@ -447,10 +602,21 @@ static void emulate_load_store_insn(struct pt_regs *regs, case ldc1_op: case swc1_op: case sdc1_op: - /* - * I herewith declare: this does not happen. So send SIGBUS. - */ - goto sigbus; + die_if_kernel("Unaligned FP access in kernel code", regs); + BUG_ON(!used_math()); + BUG_ON(!is_fpu_owner()); + + lose_fpu(1); /* Save FPU state for the emulator. */ + res = fpu_emulator_cop1Handler(regs, ¤t->thread.fpu, 1, + &fault_addr); + own_fpu(1); /* Restore FPU state. */ + + /* Signal if something went wrong. */ + process_fpemu_return(res, fault_addr); + + if (res == 0) + break; + return; /* * COP2 is available to implementor for application specific use. @@ -488,6 +654,9 @@ static void emulate_load_store_insn(struct pt_regs *regs, return; fault: + /* roll back jump/branch */ + regs->cp0_epc = origpc; + regs->regs[31] = orig31; /* Did we have an exception handler installed? */ if (fixup_exception(regs)) return; @@ -504,10 +673,881 @@ sigbus: return; sigill: - die_if_kernel("Unhandled kernel unaligned access or invalid instruction", regs); + die_if_kernel + ("Unhandled kernel unaligned access or invalid instruction", regs); force_sig(SIGILL, current); } +/* Recode table from 16-bit register notation to 32-bit GPR. */ +const int reg16to32[] = { 16, 17, 2, 3, 4, 5, 6, 7 }; + +/* Recode table from 16-bit STORE register notation to 32-bit GPR. */ +const int reg16to32st[] = { 0, 17, 2, 3, 4, 5, 6, 7 }; + +void emulate_load_store_microMIPS(struct pt_regs *regs, void __user * addr) +{ + unsigned long value; + unsigned int res; + int i; + unsigned int reg = 0, rvar; + unsigned long orig31; + u16 __user *pc16; + u16 halfword; + unsigned int word; + unsigned long origpc, contpc; + union mips_instruction insn; + struct mm_decoded_insn mminsn; + void __user *fault_addr = NULL; + + origpc = regs->cp0_epc; + orig31 = regs->regs[31]; + + mminsn.micro_mips_mode = 1; + + /* + * This load never faults. + */ + pc16 = (unsigned short __user *)msk_isa16_mode(regs->cp0_epc); + __get_user(halfword, pc16); + pc16++; + contpc = regs->cp0_epc + 2; + word = ((unsigned int)halfword << 16); + mminsn.pc_inc = 2; + + if (!mm_insn_16bit(halfword)) { + __get_user(halfword, pc16); + pc16++; + contpc = regs->cp0_epc + 4; + mminsn.pc_inc = 4; + word |= halfword; + } + mminsn.insn = word; + + if (get_user(halfword, pc16)) + goto fault; + mminsn.next_pc_inc = 2; + word = ((unsigned int)halfword << 16); + + if (!mm_insn_16bit(halfword)) { + pc16++; + if (get_user(halfword, pc16)) + goto fault; + mminsn.next_pc_inc = 4; + word |= halfword; + } + mminsn.next_insn = word; + + insn = (union mips_instruction)(mminsn.insn); + if (mm_isBranchInstr(regs, mminsn, &contpc)) + insn = (union mips_instruction)(mminsn.next_insn); + + /* Parse instruction to find what to do */ + + switch (insn.mm_i_format.opcode) { + + case mm_pool32a_op: + switch (insn.mm_x_format.func) { + case mm_lwxs_op: + reg = insn.mm_x_format.rd; + goto loadW; + } + + goto sigbus; + + case mm_pool32b_op: + switch (insn.mm_m_format.func) { + case mm_lwp_func: + reg = insn.mm_m_format.rd; + if (reg == 31) + goto sigbus; + + if (!access_ok(VERIFY_READ, addr, 8)) + goto sigbus; + + LoadW(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + addr += 4; + LoadW(addr, value, res); + if (res) + goto fault; + regs->regs[reg + 1] = value; + goto success; + + case mm_swp_func: + reg = insn.mm_m_format.rd; + if (reg == 31) + goto sigbus; + + if (!access_ok(VERIFY_WRITE, addr, 8)) + goto sigbus; + + value = regs->regs[reg]; + StoreW(addr, value, res); + if (res) + goto fault; + addr += 4; + value = regs->regs[reg + 1]; + StoreW(addr, value, res); + if (res) + goto fault; + goto success; + + case mm_ldp_func: +#ifdef CONFIG_64BIT + reg = insn.mm_m_format.rd; + if (reg == 31) + goto sigbus; + + if (!access_ok(VERIFY_READ, addr, 16)) + goto sigbus; + + LoadDW(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + addr += 8; + LoadDW(addr, value, res); + if (res) + goto fault; + regs->regs[reg + 1] = value; + goto success; +#endif /* CONFIG_64BIT */ + + goto sigill; + + case mm_sdp_func: +#ifdef CONFIG_64BIT + reg = insn.mm_m_format.rd; + if (reg == 31) + goto sigbus; + + if (!access_ok(VERIFY_WRITE, addr, 16)) + goto sigbus; + + value = regs->regs[reg]; + StoreDW(addr, value, res); + if (res) + goto fault; + addr += 8; + value = regs->regs[reg + 1]; + StoreDW(addr, value, res); + if (res) + goto fault; + goto success; +#endif /* CONFIG_64BIT */ + + goto sigill; + + case mm_lwm32_func: + reg = insn.mm_m_format.rd; + rvar = reg & 0xf; + if ((rvar > 9) || !reg) + goto sigill; + if (reg & 0x10) { + if (!access_ok + (VERIFY_READ, addr, 4 * (rvar + 1))) + goto sigbus; + } else { + if (!access_ok(VERIFY_READ, addr, 4 * rvar)) + goto sigbus; + } + if (rvar == 9) + rvar = 8; + for (i = 16; rvar; rvar--, i++) { + LoadW(addr, value, res); + if (res) + goto fault; + addr += 4; + regs->regs[i] = value; + } + if ((reg & 0xf) == 9) { + LoadW(addr, value, res); + if (res) + goto fault; + addr += 4; + regs->regs[30] = value; + } + if (reg & 0x10) { + LoadW(addr, value, res); + if (res) + goto fault; + regs->regs[31] = value; + } + goto success; + + case mm_swm32_func: + reg = insn.mm_m_format.rd; + rvar = reg & 0xf; + if ((rvar > 9) || !reg) + goto sigill; + if (reg & 0x10) { + if (!access_ok + (VERIFY_WRITE, addr, 4 * (rvar + 1))) + goto sigbus; + } else { + if (!access_ok(VERIFY_WRITE, addr, 4 * rvar)) + goto sigbus; + } + if (rvar == 9) + rvar = 8; + for (i = 16; rvar; rvar--, i++) { + value = regs->regs[i]; + StoreW(addr, value, res); + if (res) + goto fault; + addr += 4; + } + if ((reg & 0xf) == 9) { + value = regs->regs[30]; + StoreW(addr, value, res); + if (res) + goto fault; + addr += 4; + } + if (reg & 0x10) { + value = regs->regs[31]; + StoreW(addr, value, res); + if (res) + goto fault; + } + goto success; + + case mm_ldm_func: +#ifdef CONFIG_64BIT + reg = insn.mm_m_format.rd; + rvar = reg & 0xf; + if ((rvar > 9) || !reg) + goto sigill; + if (reg & 0x10) { + if (!access_ok + (VERIFY_READ, addr, 8 * (rvar + 1))) + goto sigbus; + } else { + if (!access_ok(VERIFY_READ, addr, 8 * rvar)) + goto sigbus; + } + if (rvar == 9) + rvar = 8; + + for (i = 16; rvar; rvar--, i++) { + LoadDW(addr, value, res); + if (res) + goto fault; + addr += 4; + regs->regs[i] = value; + } + if ((reg & 0xf) == 9) { + LoadDW(addr, value, res); + if (res) + goto fault; + addr += 8; + regs->regs[30] = value; + } + if (reg & 0x10) { + LoadDW(addr, value, res); + if (res) + goto fault; + regs->regs[31] = value; + } + goto success; +#endif /* CONFIG_64BIT */ + + goto sigill; + + case mm_sdm_func: +#ifdef CONFIG_64BIT + reg = insn.mm_m_format.rd; + rvar = reg & 0xf; + if ((rvar > 9) || !reg) + goto sigill; + if (reg & 0x10) { + if (!access_ok + (VERIFY_WRITE, addr, 8 * (rvar + 1))) + goto sigbus; + } else { + if (!access_ok(VERIFY_WRITE, addr, 8 * rvar)) + goto sigbus; + } + if (rvar == 9) + rvar = 8; + + for (i = 16; rvar; rvar--, i++) { + value = regs->regs[i]; + StoreDW(addr, value, res); + if (res) + goto fault; + addr += 8; + } + if ((reg & 0xf) == 9) { + value = regs->regs[30]; + StoreDW(addr, value, res); + if (res) + goto fault; + addr += 8; + } + if (reg & 0x10) { + value = regs->regs[31]; + StoreDW(addr, value, res); + if (res) + goto fault; + } + goto success; +#endif /* CONFIG_64BIT */ + + goto sigill; + + /* LWC2, SWC2, LDC2, SDC2 are not serviced */ + } + + goto sigbus; + + case mm_pool32c_op: + switch (insn.mm_m_format.func) { + case mm_lwu_func: + reg = insn.mm_m_format.rd; + goto loadWU; + } + + /* LL,SC,LLD,SCD are not serviced */ + goto sigbus; + + case mm_pool32f_op: + switch (insn.mm_x_format.func) { + case mm_lwxc1_func: + case mm_swxc1_func: + case mm_ldxc1_func: + case mm_sdxc1_func: + goto fpu_emul; + } + + goto sigbus; + + case mm_ldc132_op: + case mm_sdc132_op: + case mm_lwc132_op: + case mm_swc132_op: +fpu_emul: + /* roll back jump/branch */ + regs->cp0_epc = origpc; + regs->regs[31] = orig31; + + die_if_kernel("Unaligned FP access in kernel code", regs); + BUG_ON(!used_math()); + BUG_ON(!is_fpu_owner()); + + lose_fpu(1); /* save the FPU state for the emulator */ + res = fpu_emulator_cop1Handler(regs, ¤t->thread.fpu, 1, + &fault_addr); + own_fpu(1); /* restore FPU state */ + + /* If something went wrong, signal */ + process_fpemu_return(res, fault_addr); + + if (res == 0) + goto success; + return; + + case mm_lh32_op: + reg = insn.mm_i_format.rt; + goto loadHW; + + case mm_lhu32_op: + reg = insn.mm_i_format.rt; + goto loadHWU; + + case mm_lw32_op: + reg = insn.mm_i_format.rt; + goto loadW; + + case mm_sh32_op: + reg = insn.mm_i_format.rt; + goto storeHW; + + case mm_sw32_op: + reg = insn.mm_i_format.rt; + goto storeW; + + case mm_ld32_op: + reg = insn.mm_i_format.rt; + goto loadDW; + + case mm_sd32_op: + reg = insn.mm_i_format.rt; + goto storeDW; + + case mm_pool16c_op: + switch (insn.mm16_m_format.func) { + case mm_lwm16_op: + reg = insn.mm16_m_format.rlist; + rvar = reg + 1; + if (!access_ok(VERIFY_READ, addr, 4 * rvar)) + goto sigbus; + + for (i = 16; rvar; rvar--, i++) { + LoadW(addr, value, res); + if (res) + goto fault; + addr += 4; + regs->regs[i] = value; + } + LoadW(addr, value, res); + if (res) + goto fault; + regs->regs[31] = value; + + goto success; + + case mm_swm16_op: + reg = insn.mm16_m_format.rlist; + rvar = reg + 1; + if (!access_ok(VERIFY_WRITE, addr, 4 * rvar)) + goto sigbus; + + for (i = 16; rvar; rvar--, i++) { + value = regs->regs[i]; + StoreW(addr, value, res); + if (res) + goto fault; + addr += 4; + } + value = regs->regs[31]; + StoreW(addr, value, res); + if (res) + goto fault; + + goto success; + + } + + goto sigbus; + + case mm_lhu16_op: + reg = reg16to32[insn.mm16_rb_format.rt]; + goto loadHWU; + + case mm_lw16_op: + reg = reg16to32[insn.mm16_rb_format.rt]; + goto loadW; + + case mm_sh16_op: + reg = reg16to32st[insn.mm16_rb_format.rt]; + goto storeHW; + + case mm_sw16_op: + reg = reg16to32st[insn.mm16_rb_format.rt]; + goto storeW; + + case mm_lwsp16_op: + reg = insn.mm16_r5_format.rt; + goto loadW; + + case mm_swsp16_op: + reg = insn.mm16_r5_format.rt; + goto storeW; + + case mm_lwgp16_op: + reg = reg16to32[insn.mm16_r3_format.rt]; + goto loadW; + + default: + goto sigill; + } + +loadHW: + if (!access_ok(VERIFY_READ, addr, 2)) + goto sigbus; + + LoadHW(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + goto success; + +loadHWU: + if (!access_ok(VERIFY_READ, addr, 2)) + goto sigbus; + + LoadHWU(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + goto success; + +loadW: + if (!access_ok(VERIFY_READ, addr, 4)) + goto sigbus; + + LoadW(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + goto success; + +loadWU: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_READ, addr, 4)) + goto sigbus; + + LoadWU(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + goto success; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + +loadDW: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_READ, addr, 8)) + goto sigbus; + + LoadDW(addr, value, res); + if (res) + goto fault; + regs->regs[reg] = value; + goto success; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + +storeHW: + if (!access_ok(VERIFY_WRITE, addr, 2)) + goto sigbus; + + value = regs->regs[reg]; + StoreHW(addr, value, res); + if (res) + goto fault; + goto success; + +storeW: + if (!access_ok(VERIFY_WRITE, addr, 4)) + goto sigbus; + + value = regs->regs[reg]; + StoreW(addr, value, res); + if (res) + goto fault; + goto success; + +storeDW: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_WRITE, addr, 8)) + goto sigbus; + + value = regs->regs[reg]; + StoreDW(addr, value, res); + if (res) + goto fault; + goto success; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + +success: + regs->cp0_epc = contpc; /* advance or branch */ + +#ifdef CONFIG_DEBUG_FS + unaligned_instructions++; +#endif + return; + +fault: + /* roll back jump/branch */ + regs->cp0_epc = origpc; + regs->regs[31] = orig31; + /* Did we have an exception handler installed? */ + if (fixup_exception(regs)) + return; + + die_if_kernel("Unhandled kernel unaligned access", regs); + force_sig(SIGSEGV, current); + + return; + +sigbus: + die_if_kernel("Unhandled kernel unaligned access", regs); + force_sig(SIGBUS, current); + + return; + +sigill: + die_if_kernel + ("Unhandled kernel unaligned access or invalid instruction", regs); + force_sig(SIGILL, current); +} + +static void emulate_load_store_MIPS16e(struct pt_regs *regs, void __user * addr) +{ + unsigned long value; + unsigned int res; + int reg; + unsigned long orig31; + u16 __user *pc16; + unsigned long origpc; + union mips16e_instruction mips16inst, oldinst; + + origpc = regs->cp0_epc; + orig31 = regs->regs[31]; + pc16 = (unsigned short __user *)msk_isa16_mode(origpc); + /* + * This load never faults. + */ + __get_user(mips16inst.full, pc16); + oldinst = mips16inst; + + /* skip EXTEND instruction */ + if (mips16inst.ri.opcode == MIPS16e_extend_op) { + pc16++; + __get_user(mips16inst.full, pc16); + } else if (delay_slot(regs)) { + /* skip jump instructions */ + /* JAL/JALX are 32 bits but have OPCODE in first short int */ + if (mips16inst.ri.opcode == MIPS16e_jal_op) + pc16++; + pc16++; + if (get_user(mips16inst.full, pc16)) + goto sigbus; + } + + switch (mips16inst.ri.opcode) { + case MIPS16e_i64_op: /* I64 or RI64 instruction */ + switch (mips16inst.i64.func) { /* I64/RI64 func field check */ + case MIPS16e_ldpc_func: + case MIPS16e_ldsp_func: + reg = reg16to32[mips16inst.ri64.ry]; + goto loadDW; + + case MIPS16e_sdsp_func: + reg = reg16to32[mips16inst.ri64.ry]; + goto writeDW; + + case MIPS16e_sdrasp_func: + reg = 29; /* GPRSP */ + goto writeDW; + } + + goto sigbus; + + case MIPS16e_swsp_op: + case MIPS16e_lwpc_op: + case MIPS16e_lwsp_op: + reg = reg16to32[mips16inst.ri.rx]; + break; + + case MIPS16e_i8_op: + if (mips16inst.i8.func != MIPS16e_swrasp_func) + goto sigbus; + reg = 29; /* GPRSP */ + break; + + default: + reg = reg16to32[mips16inst.rri.ry]; + break; + } + + switch (mips16inst.ri.opcode) { + + case MIPS16e_lb_op: + case MIPS16e_lbu_op: + case MIPS16e_sb_op: + goto sigbus; + + case MIPS16e_lh_op: + if (!access_ok(VERIFY_READ, addr, 2)) + goto sigbus; + + LoadHW(addr, value, res); + if (res) + goto fault; + MIPS16e_compute_return_epc(regs, &oldinst); + regs->regs[reg] = value; + break; + + case MIPS16e_lhu_op: + if (!access_ok(VERIFY_READ, addr, 2)) + goto sigbus; + + LoadHWU(addr, value, res); + if (res) + goto fault; + MIPS16e_compute_return_epc(regs, &oldinst); + regs->regs[reg] = value; + break; + + case MIPS16e_lw_op: + case MIPS16e_lwpc_op: + case MIPS16e_lwsp_op: + if (!access_ok(VERIFY_READ, addr, 4)) + goto sigbus; + + LoadW(addr, value, res); + if (res) + goto fault; + MIPS16e_compute_return_epc(regs, &oldinst); + regs->regs[reg] = value; + break; + + case MIPS16e_lwu_op: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_READ, addr, 4)) + goto sigbus; + + LoadWU(addr, value, res); + if (res) + goto fault; + MIPS16e_compute_return_epc(regs, &oldinst); + regs->regs[reg] = value; + break; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + + case MIPS16e_ld_op: +loadDW: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_READ, addr, 8)) + goto sigbus; + + LoadDW(addr, value, res); + if (res) + goto fault; + MIPS16e_compute_return_epc(regs, &oldinst); + regs->regs[reg] = value; + break; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + + case MIPS16e_sh_op: + if (!access_ok(VERIFY_WRITE, addr, 2)) + goto sigbus; + + MIPS16e_compute_return_epc(regs, &oldinst); + value = regs->regs[reg]; + StoreHW(addr, value, res); + if (res) + goto fault; + break; + + case MIPS16e_sw_op: + case MIPS16e_swsp_op: + case MIPS16e_i8_op: /* actually - MIPS16e_swrasp_func */ + if (!access_ok(VERIFY_WRITE, addr, 4)) + goto sigbus; + + MIPS16e_compute_return_epc(regs, &oldinst); + value = regs->regs[reg]; + StoreW(addr, value, res); + if (res) + goto fault; + break; + + case MIPS16e_sd_op: +writeDW: +#ifdef CONFIG_64BIT + /* + * A 32-bit kernel might be running on a 64-bit processor. But + * if we're on a 32-bit processor and an i-cache incoherency + * or race makes us see a 64-bit instruction here the sdl/sdr + * would blow up, so for now we don't handle unaligned 64-bit + * instructions on 32-bit kernels. + */ + if (!access_ok(VERIFY_WRITE, addr, 8)) + goto sigbus; + + MIPS16e_compute_return_epc(regs, &oldinst); + value = regs->regs[reg]; + StoreDW(addr, value, res); + if (res) + goto fault; + break; +#endif /* CONFIG_64BIT */ + + /* Cannot handle 64-bit instructions in 32-bit kernel */ + goto sigill; + + default: + /* + * Pheeee... We encountered an yet unknown instruction or + * cache coherence problem. Die sucker, die ... + */ + goto sigill; + } + +#ifdef CONFIG_DEBUG_FS + unaligned_instructions++; +#endif + + return; + +fault: + /* roll back jump/branch */ + regs->cp0_epc = origpc; + regs->regs[31] = orig31; + /* Did we have an exception handler installed? */ + if (fixup_exception(regs)) + return; + + die_if_kernel("Unhandled kernel unaligned access", regs); + force_sig(SIGSEGV, current); + + return; + +sigbus: + die_if_kernel("Unhandled kernel unaligned access", regs); + force_sig(SIGBUS, current); + + return; + +sigill: + die_if_kernel + ("Unhandled kernel unaligned access or invalid instruction", regs); + force_sig(SIGILL, current); +} asmlinkage void do_ade(struct pt_regs *regs) { unsigned int __user *pc; @@ -517,23 +1557,62 @@ asmlinkage void do_ade(struct pt_regs *regs) 1, regs, regs->cp0_badvaddr); /* * Did we catch a fault trying to load an instruction? - * Or are we running in MIPS16 mode? */ - if ((regs->cp0_badvaddr == regs->cp0_epc) || (regs->cp0_epc & 0x1)) + if (regs->cp0_badvaddr == regs->cp0_epc) goto sigbus; - pc = (unsigned int __user *) exception_epc(regs); if (user_mode(regs) && !test_thread_flag(TIF_FIXADE)) goto sigbus; if (unaligned_action == UNALIGNED_ACTION_SIGNAL) goto sigbus; - else if (unaligned_action == UNALIGNED_ACTION_SHOW) - show_registers(regs); /* * Do branch emulation only if we didn't forward the exception. * This is all so but ugly ... */ + + /* + * Are we running in microMIPS mode? + */ + if (get_isa16_mode(regs->cp0_epc)) { + /* + * Did we catch a fault trying to load an instruction in + * 16-bit mode? + */ + if (regs->cp0_badvaddr == msk_isa16_mode(regs->cp0_epc)) + goto sigbus; + if (unaligned_action == UNALIGNED_ACTION_SHOW) + show_registers(regs); + + if (cpu_has_mmips) { + seg = get_fs(); + if (!user_mode(regs)) + set_fs(KERNEL_DS); + emulate_load_store_microMIPS(regs, + (void __user *)regs->cp0_badvaddr); + set_fs(seg); + + return; + } + + if (cpu_has_mips16) { + seg = get_fs(); + if (!user_mode(regs)) + set_fs(KERNEL_DS); + emulate_load_store_MIPS16e(regs, + (void __user *)regs->cp0_badvaddr); + set_fs(seg); + + return; + } + + goto sigbus; + } + + if (unaligned_action == UNALIGNED_ACTION_SHOW) + show_registers(regs); + pc = (unsigned int __user *)exception_epc(regs); + seg = get_fs(); if (!user_mode(regs)) set_fs(KERNEL_DS); diff --git a/arch/mips/kvm/00README.txt b/arch/mips/kvm/00README.txt new file mode 100644 index 000000000000..51617e481aa3 --- /dev/null +++ b/arch/mips/kvm/00README.txt @@ -0,0 +1,31 @@ +KVM/MIPS Trap & Emulate Release Notes +===================================== + +(1) KVM/MIPS should support MIPS32R2 and beyond. It has been tested on the following platforms: + Malta Board with FPGA based 34K + Sigma Designs TangoX board with a 24K based 8654 SoC. + Malta Board with 74K @ 1GHz + +(2) Both Guest kernel and Guest Userspace execute in UM. + Guest User address space: 0x00000000 -> 0x40000000 + Guest Kernel Unmapped: 0x40000000 -> 0x60000000 + Guest Kernel Mapped: 0x60000000 -> 0x80000000 + + Guest Usermode virtual memory is limited to 1GB. + +(2) 16K Page Sizes: Both Host Kernel and Guest Kernel should have the same page size, currently at least 16K. + Note that due to cache aliasing issues, 4K page sizes are NOT supported. + +(3) No HugeTLB Support + Both the host kernel and Guest kernel should have the page size set to 16K. + This will be implemented in a future release. + +(4) KVM/MIPS does not have support for SMP Guests + Linux-3.7-rc2 based SMP guest hangs due to the following code sequence in the generated TLB handlers: + LL/TLBP/SC. Since the TLBP instruction causes a trap the reservation gets cleared + when we ERET back to the guest. This causes the guest to hang in an infinite loop. + This will be fixed in a future release. + +(5) Use Host FPU + Currently KVM/MIPS emulates a 24K CPU without a FPU. + This will be fixed in a future release diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig new file mode 100644 index 000000000000..2c15590e55f7 --- /dev/null +++ b/arch/mips/kvm/Kconfig @@ -0,0 +1,49 @@ +# +# KVM configuration +# +source "virt/kvm/Kconfig" + +menuconfig VIRTUALIZATION + bool "Virtualization" + depends on HAVE_KVM + ---help--- + Say Y here to get to see options for using your Linux host to run + other operating systems inside virtual machines (guests). + This option alone does not add any kernel code. + + If you say N, all options in this submenu will be skipped and disabled. + +if VIRTUALIZATION + +config KVM + tristate "Kernel-based Virtual Machine (KVM) support" + depends on HAVE_KVM + select PREEMPT_NOTIFIERS + select ANON_INODES + select KVM_MMIO + ---help--- + Support for hosting Guest kernels. + Currently supported on MIPS32 processors. + +config KVM_MIPS_DYN_TRANS + bool "KVM/MIPS: Dynamic binary translation to reduce traps" + depends on KVM + ---help--- + When running in Trap & Emulate mode patch privileged + instructions to reduce the number of traps. + + If unsure, say Y. + +config KVM_MIPS_DEBUG_COP0_COUNTERS + bool "Maintain counters for COP0 accesses" + depends on KVM + ---help--- + Maintain statistics for Guest COP0 accesses. + A histogram of COP0 accesses is printed when the VM is + shutdown. + + If unsure, say N. + +source drivers/vhost/Kconfig + +endif # VIRTUALIZATION diff --git a/arch/mips/kvm/Makefile b/arch/mips/kvm/Makefile new file mode 100644 index 000000000000..78d87bbc99db --- /dev/null +++ b/arch/mips/kvm/Makefile @@ -0,0 +1,13 @@ +# Makefile for KVM support for MIPS +# + +common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o) + +EXTRA_CFLAGS += -Ivirt/kvm -Iarch/mips/kvm + +kvm-objs := $(common-objs) kvm_mips.o kvm_mips_emul.o kvm_locore.o \ + kvm_mips_int.o kvm_mips_stats.o kvm_mips_commpage.o \ + kvm_mips_dyntrans.o kvm_trap_emul.o + +obj-$(CONFIG_KVM) += kvm.o +obj-y += kvm_cb.o kvm_tlb.o diff --git a/arch/mips/kvm/kvm_cb.c b/arch/mips/kvm/kvm_cb.c new file mode 100644 index 000000000000..313c2e37b978 --- /dev/null +++ b/arch/mips/kvm/kvm_cb.c @@ -0,0 +1,14 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. + * Authors: Yann Le Du + */ + +#include +#include + +struct kvm_mips_callbacks *kvm_mips_callbacks; +EXPORT_SYMBOL(kvm_mips_callbacks); diff --git a/arch/mips/kvm/kvm_locore.S b/arch/mips/kvm/kvm_locore.S new file mode 100644 index 000000000000..dca2aa665993 --- /dev/null +++ b/arch/mips/kvm/kvm_locore.S @@ -0,0 +1,650 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* Main entry point for the guest, exception handling. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include + + +#define _C_LABEL(x) x +#define MIPSX(name) mips32_ ## name +#define CALLFRAME_SIZ 32 + +/* + * VECTOR + * exception vector entrypoint + */ +#define VECTOR(x, regmask) \ + .ent _C_LABEL(x),0; \ + EXPORT(x); + +#define VECTOR_END(x) \ + EXPORT(x); + +/* Overload, Danger Will Robinson!! */ +#define PT_HOST_ASID PT_BVADDR +#define PT_HOST_USERLOCAL PT_EPC + +#define CP0_DDATA_LO $28,3 +#define CP0_EBASE $15,1 + +#define CP0_INTCTL $12,1 +#define CP0_SRSCTL $12,2 +#define CP0_SRSMAP $12,3 +#define CP0_HWRENA $7,0 + +/* Resume Flags */ +#define RESUME_FLAG_HOST (1<<1) /* Resume host? */ + +#define RESUME_GUEST 0 +#define RESUME_HOST RESUME_FLAG_HOST + +/* + * __kvm_mips_vcpu_run: entry point to the guest + * a0: run + * a1: vcpu + */ + +FEXPORT(__kvm_mips_vcpu_run) + .set push + .set noreorder + .set noat + + /* k0/k1 not being used in host kernel context */ + addiu k1,sp, -PT_SIZE + LONG_S $0, PT_R0(k1) + LONG_S $1, PT_R1(k1) + LONG_S $2, PT_R2(k1) + LONG_S $3, PT_R3(k1) + + LONG_S $4, PT_R4(k1) + LONG_S $5, PT_R5(k1) + LONG_S $6, PT_R6(k1) + LONG_S $7, PT_R7(k1) + + LONG_S $8, PT_R8(k1) + LONG_S $9, PT_R9(k1) + LONG_S $10, PT_R10(k1) + LONG_S $11, PT_R11(k1) + LONG_S $12, PT_R12(k1) + LONG_S $13, PT_R13(k1) + LONG_S $14, PT_R14(k1) + LONG_S $15, PT_R15(k1) + LONG_S $16, PT_R16(k1) + LONG_S $17, PT_R17(k1) + + LONG_S $18, PT_R18(k1) + LONG_S $19, PT_R19(k1) + LONG_S $20, PT_R20(k1) + LONG_S $21, PT_R21(k1) + LONG_S $22, PT_R22(k1) + LONG_S $23, PT_R23(k1) + LONG_S $24, PT_R24(k1) + LONG_S $25, PT_R25(k1) + + /* XXXKYMA k0/k1 not saved, not being used if we got here through an ioctl() */ + + LONG_S $28, PT_R28(k1) + LONG_S $29, PT_R29(k1) + LONG_S $30, PT_R30(k1) + LONG_S $31, PT_R31(k1) + + /* Save hi/lo */ + mflo v0 + LONG_S v0, PT_LO(k1) + mfhi v1 + LONG_S v1, PT_HI(k1) + + /* Save host status */ + mfc0 v0, CP0_STATUS + LONG_S v0, PT_STATUS(k1) + + /* Save host ASID, shove it into the BVADDR location */ + mfc0 v1,CP0_ENTRYHI + andi v1, 0xff + LONG_S v1, PT_HOST_ASID(k1) + + /* Save DDATA_LO, will be used to store pointer to vcpu */ + mfc0 v1, CP0_DDATA_LO + LONG_S v1, PT_HOST_USERLOCAL(k1) + + /* DDATA_LO has pointer to vcpu */ + mtc0 a1,CP0_DDATA_LO + + /* Offset into vcpu->arch */ + addiu k1, a1, VCPU_HOST_ARCH + + /* Save the host stack to VCPU, used for exception processing when we exit from the Guest */ + LONG_S sp, VCPU_HOST_STACK(k1) + + /* Save the kernel gp as well */ + LONG_S gp, VCPU_HOST_GP(k1) + + /* Setup status register for running the guest in UM, interrupts are disabled */ + li k0,(ST0_EXL | KSU_USER| ST0_BEV) + mtc0 k0,CP0_STATUS + ehb + + /* load up the new EBASE */ + LONG_L k0, VCPU_GUEST_EBASE(k1) + mtc0 k0,CP0_EBASE + + /* Now that the new EBASE has been loaded, unset BEV, set interrupt mask as it was + * but make sure that timer interrupts are enabled + */ + li k0,(ST0_EXL | KSU_USER | ST0_IE) + andi v0, v0, ST0_IM + or k0, k0, v0 + mtc0 k0,CP0_STATUS + ehb + + + /* Set Guest EPC */ + LONG_L t0, VCPU_PC(k1) + mtc0 t0, CP0_EPC + +FEXPORT(__kvm_mips_load_asid) + /* Set the ASID for the Guest Kernel */ + sll t0, t0, 1 /* with kseg0 @ 0x40000000, kernel */ + /* addresses shift to 0x80000000 */ + bltz t0, 1f /* If kernel */ + addiu t1, k1, VCPU_GUEST_KERNEL_ASID /* (BD) */ + addiu t1, k1, VCPU_GUEST_USER_ASID /* else user */ +1: + /* t1: contains the base of the ASID array, need to get the cpu id */ + LONG_L t2, TI_CPU($28) /* smp_processor_id */ + sll t2, t2, 2 /* x4 */ + addu t3, t1, t2 + LONG_L k0, (t3) + andi k0, k0, 0xff + mtc0 k0,CP0_ENTRYHI + ehb + + /* Disable RDHWR access */ + mtc0 zero, CP0_HWRENA + + /* Now load up the Guest Context from VCPU */ + LONG_L $1, VCPU_R1(k1) + LONG_L $2, VCPU_R2(k1) + LONG_L $3, VCPU_R3(k1) + + LONG_L $4, VCPU_R4(k1) + LONG_L $5, VCPU_R5(k1) + LONG_L $6, VCPU_R6(k1) + LONG_L $7, VCPU_R7(k1) + + LONG_L $8, VCPU_R8(k1) + LONG_L $9, VCPU_R9(k1) + LONG_L $10, VCPU_R10(k1) + LONG_L $11, VCPU_R11(k1) + LONG_L $12, VCPU_R12(k1) + LONG_L $13, VCPU_R13(k1) + LONG_L $14, VCPU_R14(k1) + LONG_L $15, VCPU_R15(k1) + LONG_L $16, VCPU_R16(k1) + LONG_L $17, VCPU_R17(k1) + LONG_L $18, VCPU_R18(k1) + LONG_L $19, VCPU_R19(k1) + LONG_L $20, VCPU_R20(k1) + LONG_L $21, VCPU_R21(k1) + LONG_L $22, VCPU_R22(k1) + LONG_L $23, VCPU_R23(k1) + LONG_L $24, VCPU_R24(k1) + LONG_L $25, VCPU_R25(k1) + + /* k0/k1 loaded up later */ + + LONG_L $28, VCPU_R28(k1) + LONG_L $29, VCPU_R29(k1) + LONG_L $30, VCPU_R30(k1) + LONG_L $31, VCPU_R31(k1) + + /* Restore hi/lo */ + LONG_L k0, VCPU_LO(k1) + mtlo k0 + + LONG_L k0, VCPU_HI(k1) + mthi k0 + +FEXPORT(__kvm_mips_load_k0k1) + /* Restore the guest's k0/k1 registers */ + LONG_L k0, VCPU_R26(k1) + LONG_L k1, VCPU_R27(k1) + + /* Jump to guest */ + eret + .set pop + +VECTOR(MIPSX(exception), unknown) +/* + * Find out what mode we came from and jump to the proper handler. + */ + .set push + .set noat + .set noreorder + mtc0 k0, CP0_ERROREPC #01: Save guest k0 + ehb #02: + + mfc0 k0, CP0_EBASE #02: Get EBASE + srl k0, k0, 10 #03: Get rid of CPUNum + sll k0, k0, 10 #04 + LONG_S k1, 0x3000(k0) #05: Save k1 @ offset 0x3000 + addiu k0, k0, 0x2000 #06: Exception handler is installed @ offset 0x2000 + j k0 #07: jump to the function + nop #08: branch delay slot + .set push +VECTOR_END(MIPSX(exceptionEnd)) +.end MIPSX(exception) + +/* + * Generic Guest exception handler. We end up here when the guest + * does something that causes a trap to kernel mode. + * + */ +NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra) + .set push + .set noat + .set noreorder + + /* Get the VCPU pointer from DDTATA_LO */ + mfc0 k1, CP0_DDATA_LO + addiu k1, k1, VCPU_HOST_ARCH + + /* Start saving Guest context to VCPU */ + LONG_S $0, VCPU_R0(k1) + LONG_S $1, VCPU_R1(k1) + LONG_S $2, VCPU_R2(k1) + LONG_S $3, VCPU_R3(k1) + LONG_S $4, VCPU_R4(k1) + LONG_S $5, VCPU_R5(k1) + LONG_S $6, VCPU_R6(k1) + LONG_S $7, VCPU_R7(k1) + LONG_S $8, VCPU_R8(k1) + LONG_S $9, VCPU_R9(k1) + LONG_S $10, VCPU_R10(k1) + LONG_S $11, VCPU_R11(k1) + LONG_S $12, VCPU_R12(k1) + LONG_S $13, VCPU_R13(k1) + LONG_S $14, VCPU_R14(k1) + LONG_S $15, VCPU_R15(k1) + LONG_S $16, VCPU_R16(k1) + LONG_S $17,VCPU_R17(k1) + LONG_S $18, VCPU_R18(k1) + LONG_S $19, VCPU_R19(k1) + LONG_S $20, VCPU_R20(k1) + LONG_S $21, VCPU_R21(k1) + LONG_S $22, VCPU_R22(k1) + LONG_S $23, VCPU_R23(k1) + LONG_S $24, VCPU_R24(k1) + LONG_S $25, VCPU_R25(k1) + + /* Guest k0/k1 saved later */ + + LONG_S $28, VCPU_R28(k1) + LONG_S $29, VCPU_R29(k1) + LONG_S $30, VCPU_R30(k1) + LONG_S $31, VCPU_R31(k1) + + /* We need to save hi/lo and restore them on + * the way out + */ + mfhi t0 + LONG_S t0, VCPU_HI(k1) + + mflo t0 + LONG_S t0, VCPU_LO(k1) + + /* Finally save guest k0/k1 to VCPU */ + mfc0 t0, CP0_ERROREPC + LONG_S t0, VCPU_R26(k1) + + /* Get GUEST k1 and save it in VCPU */ + la t1, ~0x2ff + mfc0 t0, CP0_EBASE + and t0, t0, t1 + LONG_L t0, 0x3000(t0) + LONG_S t0, VCPU_R27(k1) + + /* Now that context has been saved, we can use other registers */ + + /* Restore vcpu */ + mfc0 a1, CP0_DDATA_LO + move s1, a1 + + /* Restore run (vcpu->run) */ + LONG_L a0, VCPU_RUN(a1) + /* Save pointer to run in s0, will be saved by the compiler */ + move s0, a0 + + + /* Save Host level EPC, BadVaddr and Cause to VCPU, useful to process the exception */ + mfc0 k0,CP0_EPC + LONG_S k0, VCPU_PC(k1) + + mfc0 k0, CP0_BADVADDR + LONG_S k0, VCPU_HOST_CP0_BADVADDR(k1) + + mfc0 k0, CP0_CAUSE + LONG_S k0, VCPU_HOST_CP0_CAUSE(k1) + + mfc0 k0, CP0_ENTRYHI + LONG_S k0, VCPU_HOST_ENTRYHI(k1) + + /* Now restore the host state just enough to run the handlers */ + + /* Swtich EBASE to the one used by Linux */ + /* load up the host EBASE */ + mfc0 v0, CP0_STATUS + + .set at + or k0, v0, ST0_BEV + .set noat + + mtc0 k0, CP0_STATUS + ehb + + LONG_L k0, VCPU_HOST_EBASE(k1) + mtc0 k0,CP0_EBASE + + + /* Now that the new EBASE has been loaded, unset BEV and KSU_USER */ + .set at + and v0, v0, ~(ST0_EXL | KSU_USER | ST0_IE) + or v0, v0, ST0_CU0 + .set noat + mtc0 v0, CP0_STATUS + ehb + + /* Load up host GP */ + LONG_L gp, VCPU_HOST_GP(k1) + + /* Need a stack before we can jump to "C" */ + LONG_L sp, VCPU_HOST_STACK(k1) + + /* Saved host state */ + addiu sp,sp, -PT_SIZE + + /* XXXKYMA do we need to load the host ASID, maybe not because the + * kernel entries are marked GLOBAL, need to verify + */ + + /* Restore host DDATA_LO */ + LONG_L k0, PT_HOST_USERLOCAL(sp) + mtc0 k0, CP0_DDATA_LO + + /* Restore RDHWR access */ + la k0, 0x2000000F + mtc0 k0, CP0_HWRENA + + /* Jump to handler */ +FEXPORT(__kvm_mips_jump_to_handler) + /* XXXKYMA: not sure if this is safe, how large is the stack?? */ + /* Now jump to the kvm_mips_handle_exit() to see if we can deal with this in the kernel */ + la t9,kvm_mips_handle_exit + jalr.hb t9 + addiu sp,sp, -CALLFRAME_SIZ /* BD Slot */ + + /* Return from handler Make sure interrupts are disabled */ + di + ehb + + /* XXXKYMA: k0/k1 could have been blown away if we processed an exception + * while we were handling the exception from the guest, reload k1 + */ + move k1, s1 + addiu k1, k1, VCPU_HOST_ARCH + + /* Check return value, should tell us if we are returning to the host (handle I/O etc) + * or resuming the guest + */ + andi t0, v0, RESUME_HOST + bnez t0, __kvm_mips_return_to_host + nop + +__kvm_mips_return_to_guest: + /* Put the saved pointer to vcpu (s1) back into the DDATA_LO Register */ + mtc0 s1, CP0_DDATA_LO + + /* Load up the Guest EBASE to minimize the window where BEV is set */ + LONG_L t0, VCPU_GUEST_EBASE(k1) + + /* Switch EBASE back to the one used by KVM */ + mfc0 v1, CP0_STATUS + .set at + or k0, v1, ST0_BEV + .set noat + mtc0 k0, CP0_STATUS + ehb + mtc0 t0,CP0_EBASE + + /* Setup status register for running guest in UM */ + .set at + or v1, v1, (ST0_EXL | KSU_USER | ST0_IE) + and v1, v1, ~ST0_CU0 + .set noat + mtc0 v1, CP0_STATUS + ehb + + + /* Set Guest EPC */ + LONG_L t0, VCPU_PC(k1) + mtc0 t0, CP0_EPC + + /* Set the ASID for the Guest Kernel */ + sll t0, t0, 1 /* with kseg0 @ 0x40000000, kernel */ + /* addresses shift to 0x80000000 */ + bltz t0, 1f /* If kernel */ + addiu t1, k1, VCPU_GUEST_KERNEL_ASID /* (BD) */ + addiu t1, k1, VCPU_GUEST_USER_ASID /* else user */ +1: + /* t1: contains the base of the ASID array, need to get the cpu id */ + LONG_L t2, TI_CPU($28) /* smp_processor_id */ + sll t2, t2, 2 /* x4 */ + addu t3, t1, t2 + LONG_L k0, (t3) + andi k0, k0, 0xff + mtc0 k0,CP0_ENTRYHI + ehb + + /* Disable RDHWR access */ + mtc0 zero, CP0_HWRENA + + /* load the guest context from VCPU and return */ + LONG_L $0, VCPU_R0(k1) + LONG_L $1, VCPU_R1(k1) + LONG_L $2, VCPU_R2(k1) + LONG_L $3, VCPU_R3(k1) + LONG_L $4, VCPU_R4(k1) + LONG_L $5, VCPU_R5(k1) + LONG_L $6, VCPU_R6(k1) + LONG_L $7, VCPU_R7(k1) + LONG_L $8, VCPU_R8(k1) + LONG_L $9, VCPU_R9(k1) + LONG_L $10, VCPU_R10(k1) + LONG_L $11, VCPU_R11(k1) + LONG_L $12, VCPU_R12(k1) + LONG_L $13, VCPU_R13(k1) + LONG_L $14, VCPU_R14(k1) + LONG_L $15, VCPU_R15(k1) + LONG_L $16, VCPU_R16(k1) + LONG_L $17, VCPU_R17(k1) + LONG_L $18, VCPU_R18(k1) + LONG_L $19, VCPU_R19(k1) + LONG_L $20, VCPU_R20(k1) + LONG_L $21, VCPU_R21(k1) + LONG_L $22, VCPU_R22(k1) + LONG_L $23, VCPU_R23(k1) + LONG_L $24, VCPU_R24(k1) + LONG_L $25, VCPU_R25(k1) + + /* $/k1 loaded later */ + LONG_L $28, VCPU_R28(k1) + LONG_L $29, VCPU_R29(k1) + LONG_L $30, VCPU_R30(k1) + LONG_L $31, VCPU_R31(k1) + +FEXPORT(__kvm_mips_skip_guest_restore) + LONG_L k0, VCPU_HI(k1) + mthi k0 + + LONG_L k0, VCPU_LO(k1) + mtlo k0 + + LONG_L k0, VCPU_R26(k1) + LONG_L k1, VCPU_R27(k1) + + eret + +__kvm_mips_return_to_host: + /* EBASE is already pointing to Linux */ + LONG_L k1, VCPU_HOST_STACK(k1) + addiu k1,k1, -PT_SIZE + + /* Restore host DDATA_LO */ + LONG_L k0, PT_HOST_USERLOCAL(k1) + mtc0 k0, CP0_DDATA_LO + + /* Restore host ASID */ + LONG_L k0, PT_HOST_ASID(sp) + andi k0, 0xff + mtc0 k0,CP0_ENTRYHI + ehb + + /* Load context saved on the host stack */ + LONG_L $0, PT_R0(k1) + LONG_L $1, PT_R1(k1) + + /* r2/v0 is the return code, shift it down by 2 (arithmetic) to recover the err code */ + sra k0, v0, 2 + move $2, k0 + + LONG_L $3, PT_R3(k1) + LONG_L $4, PT_R4(k1) + LONG_L $5, PT_R5(k1) + LONG_L $6, PT_R6(k1) + LONG_L $7, PT_R7(k1) + LONG_L $8, PT_R8(k1) + LONG_L $9, PT_R9(k1) + LONG_L $10, PT_R10(k1) + LONG_L $11, PT_R11(k1) + LONG_L $12, PT_R12(k1) + LONG_L $13, PT_R13(k1) + LONG_L $14, PT_R14(k1) + LONG_L $15, PT_R15(k1) + LONG_L $16, PT_R16(k1) + LONG_L $17, PT_R17(k1) + LONG_L $18, PT_R18(k1) + LONG_L $19, PT_R19(k1) + LONG_L $20, PT_R20(k1) + LONG_L $21, PT_R21(k1) + LONG_L $22, PT_R22(k1) + LONG_L $23, PT_R23(k1) + LONG_L $24, PT_R24(k1) + LONG_L $25, PT_R25(k1) + + /* Host k0/k1 were not saved */ + + LONG_L $28, PT_R28(k1) + LONG_L $29, PT_R29(k1) + LONG_L $30, PT_R30(k1) + + LONG_L k0, PT_HI(k1) + mthi k0 + + LONG_L k0, PT_LO(k1) + mtlo k0 + + /* Restore RDHWR access */ + la k0, 0x2000000F + mtc0 k0, CP0_HWRENA + + + /* Restore RA, which is the address we will return to */ + LONG_L ra, PT_R31(k1) + j ra + nop + + .set pop +VECTOR_END(MIPSX(GuestExceptionEnd)) +.end MIPSX(GuestException) + +MIPSX(exceptions): + #### + ##### The exception handlers. + ##### + .word _C_LABEL(MIPSX(GuestException)) # 0 + .word _C_LABEL(MIPSX(GuestException)) # 1 + .word _C_LABEL(MIPSX(GuestException)) # 2 + .word _C_LABEL(MIPSX(GuestException)) # 3 + .word _C_LABEL(MIPSX(GuestException)) # 4 + .word _C_LABEL(MIPSX(GuestException)) # 5 + .word _C_LABEL(MIPSX(GuestException)) # 6 + .word _C_LABEL(MIPSX(GuestException)) # 7 + .word _C_LABEL(MIPSX(GuestException)) # 8 + .word _C_LABEL(MIPSX(GuestException)) # 9 + .word _C_LABEL(MIPSX(GuestException)) # 10 + .word _C_LABEL(MIPSX(GuestException)) # 11 + .word _C_LABEL(MIPSX(GuestException)) # 12 + .word _C_LABEL(MIPSX(GuestException)) # 13 + .word _C_LABEL(MIPSX(GuestException)) # 14 + .word _C_LABEL(MIPSX(GuestException)) # 15 + .word _C_LABEL(MIPSX(GuestException)) # 16 + .word _C_LABEL(MIPSX(GuestException)) # 17 + .word _C_LABEL(MIPSX(GuestException)) # 18 + .word _C_LABEL(MIPSX(GuestException)) # 19 + .word _C_LABEL(MIPSX(GuestException)) # 20 + .word _C_LABEL(MIPSX(GuestException)) # 21 + .word _C_LABEL(MIPSX(GuestException)) # 22 + .word _C_LABEL(MIPSX(GuestException)) # 23 + .word _C_LABEL(MIPSX(GuestException)) # 24 + .word _C_LABEL(MIPSX(GuestException)) # 25 + .word _C_LABEL(MIPSX(GuestException)) # 26 + .word _C_LABEL(MIPSX(GuestException)) # 27 + .word _C_LABEL(MIPSX(GuestException)) # 28 + .word _C_LABEL(MIPSX(GuestException)) # 29 + .word _C_LABEL(MIPSX(GuestException)) # 30 + .word _C_LABEL(MIPSX(GuestException)) # 31 + + +/* This routine makes changes to the instruction stream effective to the hardware. + * It should be called after the instruction stream is written. + * On return, the new instructions are effective. + * Inputs: + * a0 = Start address of new instruction stream + * a1 = Size, in bytes, of new instruction stream + */ + +#define HW_SYNCI_Step $1 +LEAF(MIPSX(SyncICache)) + .set push + .set mips32r2 + beq a1, zero, 20f + nop + addu a1, a0, a1 + rdhwr v0, HW_SYNCI_Step + beq v0, zero, 20f + nop + +10: + synci 0(a0) + addu a0, a0, v0 + sltu v1, a0, a1 + bne v1, zero, 10b + nop + sync +20: + jr.hb ra + nop + .set pop +END(MIPSX(SyncICache)) diff --git a/arch/mips/kvm/kvm_mips.c b/arch/mips/kvm/kvm_mips.c new file mode 100644 index 000000000000..e0dad0289797 --- /dev/null +++ b/arch/mips/kvm/kvm_mips.c @@ -0,0 +1,958 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * KVM/MIPS: MIPS specific KVM APIs + * + * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. + * Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "kvm_mips_int.h" +#include "kvm_mips_comm.h" + +#define CREATE_TRACE_POINTS +#include "trace.h" + +#ifndef VECTORSPACING +#define VECTORSPACING 0x100 /* for EI/VI mode */ +#endif + +#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU +struct kvm_stats_debugfs_item debugfs_entries[] = { + { "wait", VCPU_STAT(wait_exits) }, + { "cache", VCPU_STAT(cache_exits) }, + { "signal", VCPU_STAT(signal_exits) }, + { "interrupt", VCPU_STAT(int_exits) }, + { "cop_unsuable", VCPU_STAT(cop_unusable_exits) }, + { "tlbmod", VCPU_STAT(tlbmod_exits) }, + { "tlbmiss_ld", VCPU_STAT(tlbmiss_ld_exits) }, + { "tlbmiss_st", VCPU_STAT(tlbmiss_st_exits) }, + { "addrerr_st", VCPU_STAT(addrerr_st_exits) }, + { "addrerr_ld", VCPU_STAT(addrerr_ld_exits) }, + { "syscall", VCPU_STAT(syscall_exits) }, + { "resvd_inst", VCPU_STAT(resvd_inst_exits) }, + { "break_inst", VCPU_STAT(break_inst_exits) }, + { "flush_dcache", VCPU_STAT(flush_dcache_exits) }, + { "halt_wakeup", VCPU_STAT(halt_wakeup) }, + {NULL} +}; + +static int kvm_mips_reset_vcpu(struct kvm_vcpu *vcpu) +{ + int i; + for_each_possible_cpu(i) { + vcpu->arch.guest_kernel_asid[i] = 0; + vcpu->arch.guest_user_asid[i] = 0; + } + return 0; +} + +gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn) +{ + return gfn; +} + +/* XXXKYMA: We are simulatoring a processor that has the WII bit set in Config7, so we + * are "runnable" if interrupts are pending + */ +int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu) +{ + return !!(vcpu->arch.pending_exceptions); +} + +int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu) +{ + return 1; +} + +int kvm_arch_hardware_enable(void *garbage) +{ + return 0; +} + +void kvm_arch_hardware_disable(void *garbage) +{ +} + +int kvm_arch_hardware_setup(void) +{ + return 0; +} + +void kvm_arch_hardware_unsetup(void) +{ +} + +void kvm_arch_check_processor_compat(void *rtn) +{ + int *r = (int *)rtn; + *r = 0; + return; +} + +static void kvm_mips_init_tlbs(struct kvm *kvm) +{ + unsigned long wired; + + /* Add a wired entry to the TLB, it is used to map the commpage to the Guest kernel */ + wired = read_c0_wired(); + write_c0_wired(wired + 1); + mtc0_tlbw_hazard(); + kvm->arch.commpage_tlb = wired; + + kvm_debug("[%d] commpage TLB: %d\n", smp_processor_id(), + kvm->arch.commpage_tlb); +} + +static void kvm_mips_init_vm_percpu(void *arg) +{ + struct kvm *kvm = (struct kvm *)arg; + + kvm_mips_init_tlbs(kvm); + kvm_mips_callbacks->vm_init(kvm); + +} + +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) +{ + if (atomic_inc_return(&kvm_mips_instance) == 1) { + kvm_info("%s: 1st KVM instance, setup host TLB parameters\n", + __func__); + on_each_cpu(kvm_mips_init_vm_percpu, kvm, 1); + } + + + return 0; +} + +void kvm_mips_free_vcpus(struct kvm *kvm) +{ + unsigned int i; + struct kvm_vcpu *vcpu; + + /* Put the pages we reserved for the guest pmap */ + for (i = 0; i < kvm->arch.guest_pmap_npages; i++) { + if (kvm->arch.guest_pmap[i] != KVM_INVALID_PAGE) + kvm_mips_release_pfn_clean(kvm->arch.guest_pmap[i]); + } + + if (kvm->arch.guest_pmap) + kfree(kvm->arch.guest_pmap); + + kvm_for_each_vcpu(i, vcpu, kvm) { + kvm_arch_vcpu_free(vcpu); + } + + mutex_lock(&kvm->lock); + + for (i = 0; i < atomic_read(&kvm->online_vcpus); i++) + kvm->vcpus[i] = NULL; + + atomic_set(&kvm->online_vcpus, 0); + + mutex_unlock(&kvm->lock); +} + +void kvm_arch_sync_events(struct kvm *kvm) +{ +} + +static void kvm_mips_uninit_tlbs(void *arg) +{ + /* Restore wired count */ + write_c0_wired(0); + mtc0_tlbw_hazard(); + /* Clear out all the TLBs */ + kvm_local_flush_tlb_all(); +} + +void kvm_arch_destroy_vm(struct kvm *kvm) +{ + kvm_mips_free_vcpus(kvm); + + /* If this is the last instance, restore wired count */ + if (atomic_dec_return(&kvm_mips_instance) == 0) { + kvm_info("%s: last KVM instance, restoring TLB parameters\n", + __func__); + on_each_cpu(kvm_mips_uninit_tlbs, NULL, 1); + } +} + +long +kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) +{ + return -EINVAL; +} + +void kvm_arch_free_memslot(struct kvm_memory_slot *free, + struct kvm_memory_slot *dont) +{ +} + +int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages) +{ + return 0; +} + +int kvm_arch_prepare_memory_region(struct kvm *kvm, + struct kvm_memory_slot *memslot, + struct kvm_userspace_memory_region *mem, + enum kvm_mr_change change) +{ + return 0; +} + +void kvm_arch_commit_memory_region(struct kvm *kvm, + struct kvm_userspace_memory_region *mem, + const struct kvm_memory_slot *old, + enum kvm_mr_change change) +{ + unsigned long npages = 0; + int i, err = 0; + + kvm_debug("%s: kvm: %p slot: %d, GPA: %llx, size: %llx, QVA: %llx\n", + __func__, kvm, mem->slot, mem->guest_phys_addr, + mem->memory_size, mem->userspace_addr); + + /* Setup Guest PMAP table */ + if (!kvm->arch.guest_pmap) { + if (mem->slot == 0) + npages = mem->memory_size >> PAGE_SHIFT; + + if (npages) { + kvm->arch.guest_pmap_npages = npages; + kvm->arch.guest_pmap = + kzalloc(npages * sizeof(unsigned long), GFP_KERNEL); + + if (!kvm->arch.guest_pmap) { + kvm_err("Failed to allocate guest PMAP"); + err = -ENOMEM; + goto out; + } + + kvm_info + ("Allocated space for Guest PMAP Table (%ld pages) @ %p\n", + npages, kvm->arch.guest_pmap); + + /* Now setup the page table */ + for (i = 0; i < npages; i++) { + kvm->arch.guest_pmap[i] = KVM_INVALID_PAGE; + } + } + } +out: + return; +} + +void kvm_arch_flush_shadow_all(struct kvm *kvm) +{ +} + +void kvm_arch_flush_shadow_memslot(struct kvm *kvm, + struct kvm_memory_slot *slot) +{ +} + +void kvm_arch_flush_shadow(struct kvm *kvm) +{ +} + +struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id) +{ + extern char mips32_exception[], mips32_exceptionEnd[]; + extern char mips32_GuestException[], mips32_GuestExceptionEnd[]; + int err, size, offset; + void *gebase; + int i; + + struct kvm_vcpu *vcpu = kzalloc(sizeof(struct kvm_vcpu), GFP_KERNEL); + + if (!vcpu) { + err = -ENOMEM; + goto out; + } + + err = kvm_vcpu_init(vcpu, kvm, id); + + if (err) + goto out_free_cpu; + + kvm_info("kvm @ %p: create cpu %d at %p\n", kvm, id, vcpu); + + /* Allocate space for host mode exception handlers that handle + * guest mode exits + */ + if (cpu_has_veic || cpu_has_vint) { + size = 0x200 + VECTORSPACING * 64; + } else { + size = 0x200; + } + + /* Save Linux EBASE */ + vcpu->arch.host_ebase = (void *)read_c0_ebase(); + + gebase = kzalloc(ALIGN(size, PAGE_SIZE), GFP_KERNEL); + + if (!gebase) { + err = -ENOMEM; + goto out_free_cpu; + } + kvm_info("Allocated %d bytes for KVM Exception Handlers @ %p\n", + ALIGN(size, PAGE_SIZE), gebase); + + /* Save new ebase */ + vcpu->arch.guest_ebase = gebase; + + /* Copy L1 Guest Exception handler to correct offset */ + + /* TLB Refill, EXL = 0 */ + memcpy(gebase, mips32_exception, + mips32_exceptionEnd - mips32_exception); + + /* General Exception Entry point */ + memcpy(gebase + 0x180, mips32_exception, + mips32_exceptionEnd - mips32_exception); + + /* For vectored interrupts poke the exception code @ all offsets 0-7 */ + for (i = 0; i < 8; i++) { + kvm_debug("L1 Vectored handler @ %p\n", + gebase + 0x200 + (i * VECTORSPACING)); + memcpy(gebase + 0x200 + (i * VECTORSPACING), mips32_exception, + mips32_exceptionEnd - mips32_exception); + } + + /* General handler, relocate to unmapped space for sanity's sake */ + offset = 0x2000; + kvm_info("Installing KVM Exception handlers @ %p, %#x bytes\n", + gebase + offset, + mips32_GuestExceptionEnd - mips32_GuestException); + + memcpy(gebase + offset, mips32_GuestException, + mips32_GuestExceptionEnd - mips32_GuestException); + + /* Invalidate the icache for these ranges */ + mips32_SyncICache((unsigned long) gebase, ALIGN(size, PAGE_SIZE)); + + /* Allocate comm page for guest kernel, a TLB will be reserved for mapping GVA @ 0xFFFF8000 to this page */ + vcpu->arch.kseg0_commpage = kzalloc(PAGE_SIZE << 1, GFP_KERNEL); + + if (!vcpu->arch.kseg0_commpage) { + err = -ENOMEM; + goto out_free_gebase; + } + + kvm_info("Allocated COMM page @ %p\n", vcpu->arch.kseg0_commpage); + kvm_mips_commpage_init(vcpu); + + /* Init */ + vcpu->arch.last_sched_cpu = -1; + + /* Start off the timer */ + kvm_mips_emulate_count(vcpu); + + return vcpu; + +out_free_gebase: + kfree(gebase); + +out_free_cpu: + kfree(vcpu); + +out: + return ERR_PTR(err); +} + +void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu) +{ + hrtimer_cancel(&vcpu->arch.comparecount_timer); + + kvm_vcpu_uninit(vcpu); + + kvm_mips_dump_stats(vcpu); + + if (vcpu->arch.guest_ebase) + kfree(vcpu->arch.guest_ebase); + + if (vcpu->arch.kseg0_commpage) + kfree(vcpu->arch.kseg0_commpage); + +} + +void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) +{ + kvm_arch_vcpu_free(vcpu); +} + +int +kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg) +{ + return -EINVAL; +} + +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + int r = 0; + sigset_t sigsaved; + + if (vcpu->sigset_active) + sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved); + + if (vcpu->mmio_needed) { + if (!vcpu->mmio_is_write) + kvm_mips_complete_mmio_load(vcpu, run); + vcpu->mmio_needed = 0; + } + + /* Check if we have any exceptions/interrupts pending */ + kvm_mips_deliver_interrupts(vcpu, + kvm_read_c0_guest_cause(vcpu->arch.cop0)); + + local_irq_disable(); + kvm_guest_enter(); + + r = __kvm_mips_vcpu_run(run, vcpu); + + kvm_guest_exit(); + local_irq_enable(); + + if (vcpu->sigset_active) + sigprocmask(SIG_SETMASK, &sigsaved, NULL); + + return r; +} + +int +kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq) +{ + int intr = (int)irq->irq; + struct kvm_vcpu *dvcpu = NULL; + + if (intr == 3 || intr == -3 || intr == 4 || intr == -4) + kvm_debug("%s: CPU: %d, INTR: %d\n", __func__, irq->cpu, + (int)intr); + + if (irq->cpu == -1) + dvcpu = vcpu; + else + dvcpu = vcpu->kvm->vcpus[irq->cpu]; + + if (intr == 2 || intr == 3 || intr == 4) { + kvm_mips_callbacks->queue_io_int(dvcpu, irq); + + } else if (intr == -2 || intr == -3 || intr == -4) { + kvm_mips_callbacks->dequeue_io_int(dvcpu, irq); + } else { + kvm_err("%s: invalid interrupt ioctl (%d:%d)\n", __func__, + irq->cpu, irq->irq); + return -EINVAL; + } + + dvcpu->arch.wait = 0; + + if (waitqueue_active(&dvcpu->wq)) { + wake_up_interruptible(&dvcpu->wq); + } + + return 0; +} + +int +kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, + struct kvm_mp_state *mp_state) +{ + return -EINVAL; +} + +int +kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, + struct kvm_mp_state *mp_state) +{ + return -EINVAL; +} + +long +kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) +{ + struct kvm_vcpu *vcpu = filp->private_data; + void __user *argp = (void __user *)arg; + long r; + int intr; + + switch (ioctl) { + case KVM_NMI: + /* Treat the NMI as a CPU reset */ + r = kvm_mips_reset_vcpu(vcpu); + break; + case KVM_INTERRUPT: + { + struct kvm_mips_interrupt irq; + r = -EFAULT; + if (copy_from_user(&irq, argp, sizeof(irq))) + goto out; + + intr = (int)irq.irq; + + kvm_debug("[%d] %s: irq: %d\n", vcpu->vcpu_id, __func__, + irq.irq); + + r = kvm_vcpu_ioctl_interrupt(vcpu, &irq); + break; + } + default: + r = -EINVAL; + } + +out: + return r; +} + +/* + * Get (and clear) the dirty memory log for a memory slot. + */ +int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log) +{ + struct kvm_memory_slot *memslot; + unsigned long ga, ga_end; + int is_dirty = 0; + int r; + unsigned long n; + + mutex_lock(&kvm->slots_lock); + + r = kvm_get_dirty_log(kvm, log, &is_dirty); + if (r) + goto out; + + /* If nothing is dirty, don't bother messing with page tables. */ + if (is_dirty) { + memslot = &kvm->memslots->memslots[log->slot]; + + ga = memslot->base_gfn << PAGE_SHIFT; + ga_end = ga + (memslot->npages << PAGE_SHIFT); + + printk("%s: dirty, ga: %#lx, ga_end %#lx\n", __func__, ga, + ga_end); + + n = kvm_dirty_bitmap_bytes(memslot); + memset(memslot->dirty_bitmap, 0, n); + } + + r = 0; +out: + mutex_unlock(&kvm->slots_lock); + return r; + +} + +long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) +{ + long r; + + switch (ioctl) { + default: + r = -EINVAL; + } + + return r; +} + +int kvm_arch_init(void *opaque) +{ + int ret; + + if (kvm_mips_callbacks) { + kvm_err("kvm: module already exists\n"); + return -EEXIST; + } + + ret = kvm_mips_emulation_init(&kvm_mips_callbacks); + + return ret; +} + +void kvm_arch_exit(void) +{ + kvm_mips_callbacks = NULL; +} + +int +kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) +{ + return -ENOTSUPP; +} + +int +kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) +{ + return -ENOTSUPP; +} + +int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) +{ + return 0; +} + +int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + return -ENOTSUPP; +} + +int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + return -ENOTSUPP; +} + +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + +int kvm_dev_ioctl_check_extension(long ext) +{ + int r; + + switch (ext) { + case KVM_CAP_COALESCED_MMIO: + r = KVM_COALESCED_MMIO_PAGE_OFFSET; + break; + default: + r = 0; + break; + } + return r; + +} + +int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu) +{ + return kvm_mips_pending_timer(vcpu); +} + +int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu) +{ + int i; + struct mips_coproc *cop0; + + if (!vcpu) + return -1; + + printk("VCPU Register Dump:\n"); + printk("\tpc = 0x%08lx\n", vcpu->arch.pc);; + printk("\texceptions: %08lx\n", vcpu->arch.pending_exceptions); + + for (i = 0; i < 32; i += 4) { + printk("\tgpr%02d: %08lx %08lx %08lx %08lx\n", i, + vcpu->arch.gprs[i], + vcpu->arch.gprs[i + 1], + vcpu->arch.gprs[i + 2], vcpu->arch.gprs[i + 3]); + } + printk("\thi: 0x%08lx\n", vcpu->arch.hi); + printk("\tlo: 0x%08lx\n", vcpu->arch.lo); + + cop0 = vcpu->arch.cop0; + printk("\tStatus: 0x%08lx, Cause: 0x%08lx\n", + kvm_read_c0_guest_status(cop0), kvm_read_c0_guest_cause(cop0)); + + printk("\tEPC: 0x%08lx\n", kvm_read_c0_guest_epc(cop0)); + + return 0; +} + +int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + int i; + + for (i = 0; i < 32; i++) + vcpu->arch.gprs[i] = regs->gprs[i]; + + vcpu->arch.hi = regs->hi; + vcpu->arch.lo = regs->lo; + vcpu->arch.pc = regs->pc; + + return kvm_mips_callbacks->vcpu_ioctl_set_regs(vcpu, regs); +} + +int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + int i; + + for (i = 0; i < 32; i++) + regs->gprs[i] = vcpu->arch.gprs[i]; + + regs->hi = vcpu->arch.hi; + regs->lo = vcpu->arch.lo; + regs->pc = vcpu->arch.pc; + + return kvm_mips_callbacks->vcpu_ioctl_get_regs(vcpu, regs); +} + +void kvm_mips_comparecount_func(unsigned long data) +{ + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; + + kvm_mips_callbacks->queue_timer_int(vcpu); + + vcpu->arch.wait = 0; + if (waitqueue_active(&vcpu->wq)) { + wake_up_interruptible(&vcpu->wq); + } +} + +/* + * low level hrtimer wake routine. + */ +enum hrtimer_restart kvm_mips_comparecount_wakeup(struct hrtimer *timer) +{ + struct kvm_vcpu *vcpu; + + vcpu = container_of(timer, struct kvm_vcpu, arch.comparecount_timer); + kvm_mips_comparecount_func((unsigned long) vcpu); + hrtimer_forward_now(&vcpu->arch.comparecount_timer, + ktime_set(0, MS_TO_NS(10))); + return HRTIMER_RESTART; +} + +int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) +{ + kvm_mips_callbacks->vcpu_init(vcpu); + hrtimer_init(&vcpu->arch.comparecount_timer, CLOCK_MONOTONIC, + HRTIMER_MODE_REL); + vcpu->arch.comparecount_timer.function = kvm_mips_comparecount_wakeup; + kvm_mips_init_shadow_tlb(vcpu); + return 0; +} + +void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) +{ + return; +} + +int +kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, struct kvm_translation *tr) +{ + return 0; +} + +/* Initial guest state */ +int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) +{ + return kvm_mips_callbacks->vcpu_setup(vcpu); +} + +static +void kvm_mips_set_c0_status(void) +{ + uint32_t status = read_c0_status(); + + if (cpu_has_fpu) + status |= (ST0_CU1); + + if (cpu_has_dsp) + status |= (ST0_MX); + + write_c0_status(status); + ehb(); +} + +/* + * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV) + */ +int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + uint32_t cause = vcpu->arch.host_cp0_cause; + uint32_t exccode = (cause >> CAUSEB_EXCCODE) & 0x1f; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + /* Set a default exit reason */ + run->exit_reason = KVM_EXIT_UNKNOWN; + run->ready_for_interrupt_injection = 1; + + /* Set the appropriate status bits based on host CPU features, before we hit the scheduler */ + kvm_mips_set_c0_status(); + + local_irq_enable(); + + kvm_debug("kvm_mips_handle_exit: cause: %#x, PC: %p, kvm_run: %p, kvm_vcpu: %p\n", + cause, opc, run, vcpu); + + /* Do a privilege check, if in UM most of these exit conditions end up + * causing an exception to be delivered to the Guest Kernel + */ + er = kvm_mips_check_privilege(cause, opc, run, vcpu); + if (er == EMULATE_PRIV_FAIL) { + goto skip_emul; + } else if (er == EMULATE_FAIL) { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + goto skip_emul; + } + + switch (exccode) { + case T_INT: + kvm_debug("[%d]T_INT @ %p\n", vcpu->vcpu_id, opc); + + ++vcpu->stat.int_exits; + trace_kvm_exit(vcpu, INT_EXITS); + + if (need_resched()) { + cond_resched(); + } + + ret = RESUME_GUEST; + break; + + case T_COP_UNUSABLE: + kvm_debug("T_COP_UNUSABLE: @ PC: %p\n", opc); + + ++vcpu->stat.cop_unusable_exits; + trace_kvm_exit(vcpu, COP_UNUSABLE_EXITS); + ret = kvm_mips_callbacks->handle_cop_unusable(vcpu); + /* XXXKYMA: Might need to return to user space */ + if (run->exit_reason == KVM_EXIT_IRQ_WINDOW_OPEN) { + ret = RESUME_HOST; + } + break; + + case T_TLB_MOD: + ++vcpu->stat.tlbmod_exits; + trace_kvm_exit(vcpu, TLBMOD_EXITS); + ret = kvm_mips_callbacks->handle_tlb_mod(vcpu); + break; + + case T_TLB_ST_MISS: + kvm_debug + ("TLB ST fault: cause %#x, status %#lx, PC: %p, BadVaddr: %#lx\n", + cause, kvm_read_c0_guest_status(vcpu->arch.cop0), opc, + badvaddr); + + ++vcpu->stat.tlbmiss_st_exits; + trace_kvm_exit(vcpu, TLBMISS_ST_EXITS); + ret = kvm_mips_callbacks->handle_tlb_st_miss(vcpu); + break; + + case T_TLB_LD_MISS: + kvm_debug("TLB LD fault: cause %#x, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + + ++vcpu->stat.tlbmiss_ld_exits; + trace_kvm_exit(vcpu, TLBMISS_LD_EXITS); + ret = kvm_mips_callbacks->handle_tlb_ld_miss(vcpu); + break; + + case T_ADDR_ERR_ST: + ++vcpu->stat.addrerr_st_exits; + trace_kvm_exit(vcpu, ADDRERR_ST_EXITS); + ret = kvm_mips_callbacks->handle_addr_err_st(vcpu); + break; + + case T_ADDR_ERR_LD: + ++vcpu->stat.addrerr_ld_exits; + trace_kvm_exit(vcpu, ADDRERR_LD_EXITS); + ret = kvm_mips_callbacks->handle_addr_err_ld(vcpu); + break; + + case T_SYSCALL: + ++vcpu->stat.syscall_exits; + trace_kvm_exit(vcpu, SYSCALL_EXITS); + ret = kvm_mips_callbacks->handle_syscall(vcpu); + break; + + case T_RES_INST: + ++vcpu->stat.resvd_inst_exits; + trace_kvm_exit(vcpu, RESVD_INST_EXITS); + ret = kvm_mips_callbacks->handle_res_inst(vcpu); + break; + + case T_BREAK: + ++vcpu->stat.break_inst_exits; + trace_kvm_exit(vcpu, BREAK_INST_EXITS); + ret = kvm_mips_callbacks->handle_break(vcpu); + break; + + default: + kvm_err + ("Exception Code: %d, not yet handled, @ PC: %p, inst: 0x%08x BadVaddr: %#lx Status: %#lx\n", + exccode, opc, kvm_get_inst(opc, vcpu), badvaddr, + kvm_read_c0_guest_status(vcpu->arch.cop0)); + kvm_arch_vcpu_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + break; + + } + +skip_emul: + local_irq_disable(); + + if (er == EMULATE_DONE && !(ret & RESUME_HOST)) + kvm_mips_deliver_interrupts(vcpu, cause); + + if (!(ret & RESUME_HOST)) { + /* Only check for signals if not already exiting to userspace */ + if (signal_pending(current)) { + run->exit_reason = KVM_EXIT_INTR; + ret = (-EINTR << 2) | RESUME_HOST; + ++vcpu->stat.signal_exits; + trace_kvm_exit(vcpu, SIGNAL_EXITS); + } + } + + return ret; +} + +int __init kvm_mips_init(void) +{ + int ret; + + ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE); + + if (ret) + return ret; + + /* On MIPS, kernel modules are executed from "mapped space", which requires TLBs. + * The TLB handling code is statically linked with the rest of the kernel (kvm_tlb.c) + * to avoid the possibility of double faulting. The issue is that the TLB code + * references routines that are part of the the KVM module, + * which are only available once the module is loaded. + */ + kvm_mips_gfn_to_pfn = gfn_to_pfn; + kvm_mips_release_pfn_clean = kvm_release_pfn_clean; + kvm_mips_is_error_pfn = is_error_pfn; + + pr_info("KVM/MIPS Initialized\n"); + return 0; +} + +void __exit kvm_mips_exit(void) +{ + kvm_exit(); + + kvm_mips_gfn_to_pfn = NULL; + kvm_mips_release_pfn_clean = NULL; + kvm_mips_is_error_pfn = NULL; + + pr_info("KVM/MIPS unloaded\n"); +} + +module_init(kvm_mips_init); +module_exit(kvm_mips_exit); + +EXPORT_TRACEPOINT_SYMBOL(kvm_exit); diff --git a/arch/mips/kvm/kvm_mips_comm.h b/arch/mips/kvm/kvm_mips_comm.h new file mode 100644 index 000000000000..a4a8c85cc8f7 --- /dev/null +++ b/arch/mips/kvm/kvm_mips_comm.h @@ -0,0 +1,23 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: commpage: mapped into get kernel space +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#ifndef __KVM_MIPS_COMMPAGE_H__ +#define __KVM_MIPS_COMMPAGE_H__ + +struct kvm_mips_commpage { + struct mips_coproc cop0; /* COP0 state is mapped into Guest kernel via commpage */ +}; + +#define KVM_MIPS_COMM_EIDI_OFFSET 0x0 + +extern void kvm_mips_commpage_init(struct kvm_vcpu *vcpu); + +#endif /* __KVM_MIPS_COMMPAGE_H__ */ diff --git a/arch/mips/kvm/kvm_mips_commpage.c b/arch/mips/kvm/kvm_mips_commpage.c new file mode 100644 index 000000000000..3873b1ecc40f --- /dev/null +++ b/arch/mips/kvm/kvm_mips_commpage.c @@ -0,0 +1,37 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* commpage, currently used for Virtual COP0 registers. +* Mapped into the guest kernel @ 0x0. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "kvm_mips_comm.h" + +void kvm_mips_commpage_init(struct kvm_vcpu *vcpu) +{ + struct kvm_mips_commpage *page = vcpu->arch.kseg0_commpage; + memset(page, 0, sizeof(struct kvm_mips_commpage)); + + /* Specific init values for fields */ + vcpu->arch.cop0 = &page->cop0; + memset(vcpu->arch.cop0, 0, sizeof(struct mips_coproc)); + + return; +} diff --git a/arch/mips/kvm/kvm_mips_dyntrans.c b/arch/mips/kvm/kvm_mips_dyntrans.c new file mode 100644 index 000000000000..96528e2d1ea6 --- /dev/null +++ b/arch/mips/kvm/kvm_mips_dyntrans.c @@ -0,0 +1,149 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: Binary Patching for privileged instructions, reduces traps. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include + +#include "kvm_mips_comm.h" + +#define SYNCI_TEMPLATE 0x041f0000 +#define SYNCI_BASE(x) (((x) >> 21) & 0x1f) +#define SYNCI_OFFSET ((x) & 0xffff) + +#define LW_TEMPLATE 0x8c000000 +#define CLEAR_TEMPLATE 0x00000020 +#define SW_TEMPLATE 0xac000000 + +int +kvm_mips_trans_cache_index(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu) +{ + int result = 0; + unsigned long kseg0_opc; + uint32_t synci_inst = 0x0; + + /* Replace the CACHE instruction, with a NOP */ + kseg0_opc = + CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa + (vcpu, (unsigned long) opc)); + memcpy((void *)kseg0_opc, (void *)&synci_inst, sizeof(uint32_t)); + mips32_SyncICache(kseg0_opc, 32); + + return result; +} + +/* + * Address based CACHE instructions are transformed into synci(s). A little heavy + * for just D-cache invalidates, but avoids an expensive trap + */ +int +kvm_mips_trans_cache_va(uint32_t inst, uint32_t *opc, + struct kvm_vcpu *vcpu) +{ + int result = 0; + unsigned long kseg0_opc; + uint32_t synci_inst = SYNCI_TEMPLATE, base, offset; + + base = (inst >> 21) & 0x1f; + offset = inst & 0xffff; + synci_inst |= (base << 21); + synci_inst |= offset; + + kseg0_opc = + CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa + (vcpu, (unsigned long) opc)); + memcpy((void *)kseg0_opc, (void *)&synci_inst, sizeof(uint32_t)); + mips32_SyncICache(kseg0_opc, 32); + + return result; +} + +int +kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu) +{ + int32_t rt, rd, sel; + uint32_t mfc0_inst; + unsigned long kseg0_opc, flags; + + rt = (inst >> 16) & 0x1f; + rd = (inst >> 11) & 0x1f; + sel = inst & 0x7; + + if ((rd == MIPS_CP0_ERRCTL) && (sel == 0)) { + mfc0_inst = CLEAR_TEMPLATE; + mfc0_inst |= ((rt & 0x1f) << 16); + } else { + mfc0_inst = LW_TEMPLATE; + mfc0_inst |= ((rt & 0x1f) << 16); + mfc0_inst |= + offsetof(struct mips_coproc, + reg[rd][sel]) + offsetof(struct kvm_mips_commpage, + cop0); + } + + if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) { + kseg0_opc = + CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa + (vcpu, (unsigned long) opc)); + memcpy((void *)kseg0_opc, (void *)&mfc0_inst, sizeof(uint32_t)); + mips32_SyncICache(kseg0_opc, 32); + } else if (KVM_GUEST_KSEGX((unsigned long) opc) == KVM_GUEST_KSEG23) { + local_irq_save(flags); + memcpy((void *)opc, (void *)&mfc0_inst, sizeof(uint32_t)); + mips32_SyncICache((unsigned long) opc, 32); + local_irq_restore(flags); + } else { + kvm_err("%s: Invalid address: %p\n", __func__, opc); + return -EFAULT; + } + + return 0; +} + +int +kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu) +{ + int32_t rt, rd, sel; + uint32_t mtc0_inst = SW_TEMPLATE; + unsigned long kseg0_opc, flags; + + rt = (inst >> 16) & 0x1f; + rd = (inst >> 11) & 0x1f; + sel = inst & 0x7; + + mtc0_inst |= ((rt & 0x1f) << 16); + mtc0_inst |= + offsetof(struct mips_coproc, + reg[rd][sel]) + offsetof(struct kvm_mips_commpage, cop0); + + if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) { + kseg0_opc = + CKSEG0ADDR(kvm_mips_translate_guest_kseg0_to_hpa + (vcpu, (unsigned long) opc)); + memcpy((void *)kseg0_opc, (void *)&mtc0_inst, sizeof(uint32_t)); + mips32_SyncICache(kseg0_opc, 32); + } else if (KVM_GUEST_KSEGX((unsigned long) opc) == KVM_GUEST_KSEG23) { + local_irq_save(flags); + memcpy((void *)opc, (void *)&mtc0_inst, sizeof(uint32_t)); + mips32_SyncICache((unsigned long) opc, 32); + local_irq_restore(flags); + } else { + kvm_err("%s: Invalid address: %p\n", __func__, opc); + return -EFAULT; + } + + return 0; +} diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c new file mode 100644 index 000000000000..2b2bac9a40aa --- /dev/null +++ b/arch/mips/kvm/kvm_mips_emul.c @@ -0,0 +1,1826 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: Instruction/Exception emulation +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#undef CONFIG_MIPS_MT +#include +#define CONFIG_MIPS_MT + +#include "kvm_mips_opcode.h" +#include "kvm_mips_int.h" +#include "kvm_mips_comm.h" + +#include "trace.h" + +/* + * Compute the return address and do emulate branch simulation, if required. + * This function should be called only in branch delay slot active. + */ +unsigned long kvm_compute_return_epc(struct kvm_vcpu *vcpu, + unsigned long instpc) +{ + unsigned int dspcontrol; + union mips_instruction insn; + struct kvm_vcpu_arch *arch = &vcpu->arch; + long epc = instpc; + long nextpc = KVM_INVALID_INST; + + if (epc & 3) + goto unaligned; + + /* + * Read the instruction + */ + insn.word = kvm_get_inst((uint32_t *) epc, vcpu); + + if (insn.word == KVM_INVALID_INST) + return KVM_INVALID_INST; + + switch (insn.i_format.opcode) { + /* + * jr and jalr are in r_format format. + */ + case spec_op: + switch (insn.r_format.func) { + case jalr_op: + arch->gprs[insn.r_format.rd] = epc + 8; + /* Fall through */ + case jr_op: + nextpc = arch->gprs[insn.r_format.rs]; + break; + } + break; + + /* + * This group contains: + * bltz_op, bgez_op, bltzl_op, bgezl_op, + * bltzal_op, bgezal_op, bltzall_op, bgezall_op. + */ + case bcond_op: + switch (insn.i_format.rt) { + case bltz_op: + case bltzl_op: + if ((long)arch->gprs[insn.i_format.rs] < 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case bgez_op: + case bgezl_op: + if ((long)arch->gprs[insn.i_format.rs] >= 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case bltzal_op: + case bltzall_op: + arch->gprs[31] = epc + 8; + if ((long)arch->gprs[insn.i_format.rs] < 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case bgezal_op: + case bgezall_op: + arch->gprs[31] = epc + 8; + if ((long)arch->gprs[insn.i_format.rs] >= 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + case bposge32_op: + if (!cpu_has_dsp) + goto sigill; + + dspcontrol = rddsp(0x01); + + if (dspcontrol >= 32) { + epc = epc + 4 + (insn.i_format.simmediate << 2); + } else + epc += 8; + nextpc = epc; + break; + } + break; + + /* + * These are unconditional and in j_format. + */ + case jal_op: + arch->gprs[31] = instpc + 8; + case j_op: + epc += 4; + epc >>= 28; + epc <<= 28; + epc |= (insn.j_format.target << 2); + nextpc = epc; + break; + + /* + * These are conditional and in i_format. + */ + case beq_op: + case beql_op: + if (arch->gprs[insn.i_format.rs] == + arch->gprs[insn.i_format.rt]) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case bne_op: + case bnel_op: + if (arch->gprs[insn.i_format.rs] != + arch->gprs[insn.i_format.rt]) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case blez_op: /* not really i_format */ + case blezl_op: + /* rt field assumed to be zero */ + if ((long)arch->gprs[insn.i_format.rs] <= 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + case bgtz_op: + case bgtzl_op: + /* rt field assumed to be zero */ + if ((long)arch->gprs[insn.i_format.rs] > 0) + epc = epc + 4 + (insn.i_format.simmediate << 2); + else + epc += 8; + nextpc = epc; + break; + + /* + * And now the FPA/cp1 branch instructions. + */ + case cop1_op: + printk("%s: unsupported cop1_op\n", __func__); + break; + } + + return nextpc; + +unaligned: + printk("%s: unaligned epc\n", __func__); + return nextpc; + +sigill: + printk("%s: DSP branch but not DSP ASE\n", __func__); + return nextpc; +} + +enum emulation_result update_pc(struct kvm_vcpu *vcpu, uint32_t cause) +{ + unsigned long branch_pc; + enum emulation_result er = EMULATE_DONE; + + if (cause & CAUSEF_BD) { + branch_pc = kvm_compute_return_epc(vcpu, vcpu->arch.pc); + if (branch_pc == KVM_INVALID_INST) { + er = EMULATE_FAIL; + } else { + vcpu->arch.pc = branch_pc; + kvm_debug("BD update_pc(): New PC: %#lx\n", vcpu->arch.pc); + } + } else + vcpu->arch.pc += 4; + + kvm_debug("update_pc(): New PC: %#lx\n", vcpu->arch.pc); + + return er; +} + +/* Everytime the compare register is written to, we need to decide when to fire + * the timer that represents timer ticks to the GUEST. + * + */ +enum emulation_result kvm_mips_emulate_count(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + enum emulation_result er = EMULATE_DONE; + + /* If COUNT is enabled */ + if (!(kvm_read_c0_guest_cause(cop0) & CAUSEF_DC)) { + hrtimer_try_to_cancel(&vcpu->arch.comparecount_timer); + hrtimer_start(&vcpu->arch.comparecount_timer, + ktime_set(0, MS_TO_NS(10)), HRTIMER_MODE_REL); + } else { + hrtimer_try_to_cancel(&vcpu->arch.comparecount_timer); + } + + return er; +} + +enum emulation_result kvm_mips_emul_eret(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + enum emulation_result er = EMULATE_DONE; + + if (kvm_read_c0_guest_status(cop0) & ST0_EXL) { + kvm_debug("[%#lx] ERET to %#lx\n", vcpu->arch.pc, + kvm_read_c0_guest_epc(cop0)); + kvm_clear_c0_guest_status(cop0, ST0_EXL); + vcpu->arch.pc = kvm_read_c0_guest_epc(cop0); + + } else if (kvm_read_c0_guest_status(cop0) & ST0_ERL) { + kvm_clear_c0_guest_status(cop0, ST0_ERL); + vcpu->arch.pc = kvm_read_c0_guest_errorepc(cop0); + } else { + printk("[%#lx] ERET when MIPS_SR_EXL|MIPS_SR_ERL == 0\n", + vcpu->arch.pc); + er = EMULATE_FAIL; + } + + return er; +} + +enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DONE; + + kvm_debug("[%#lx] !!!WAIT!!! (%#lx)\n", vcpu->arch.pc, + vcpu->arch.pending_exceptions); + + ++vcpu->stat.wait_exits; + trace_kvm_exit(vcpu, WAIT_EXITS); + if (!vcpu->arch.pending_exceptions) { + vcpu->arch.wait = 1; + kvm_vcpu_block(vcpu); + + /* We we are runnable, then definitely go off to user space to check if any + * I/O interrupts are pending. + */ + if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) { + clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN; + } + } + + return er; +} + +/* XXXKYMA: Linux doesn't seem to use TLBR, return EMULATE_FAIL for now so that we can catch + * this, if things ever change + */ +enum emulation_result kvm_mips_emul_tlbr(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + enum emulation_result er = EMULATE_FAIL; + uint32_t pc = vcpu->arch.pc; + + printk("[%#x] COP0_TLBR [%ld]\n", pc, kvm_read_c0_guest_index(cop0)); + return er; +} + +/* Write Guest TLB Entry @ Index */ +enum emulation_result kvm_mips_emul_tlbwi(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + int index = kvm_read_c0_guest_index(cop0); + enum emulation_result er = EMULATE_DONE; + struct kvm_mips_tlb *tlb = NULL; + uint32_t pc = vcpu->arch.pc; + + if (index < 0 || index >= KVM_MIPS_GUEST_TLB_SIZE) { + printk("%s: illegal index: %d\n", __func__, index); + printk + ("[%#x] COP0_TLBWI [%d] (entryhi: %#lx, entrylo0: %#lx entrylo1: %#lx, mask: %#lx)\n", + pc, index, kvm_read_c0_guest_entryhi(cop0), + kvm_read_c0_guest_entrylo0(cop0), + kvm_read_c0_guest_entrylo1(cop0), + kvm_read_c0_guest_pagemask(cop0)); + index = (index & ~0x80000000) % KVM_MIPS_GUEST_TLB_SIZE; + } + + tlb = &vcpu->arch.guest_tlb[index]; +#if 1 + /* Probe the shadow host TLB for the entry being overwritten, if one matches, invalidate it */ + kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi); +#endif + + tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0); + tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0); + tlb->tlb_lo0 = kvm_read_c0_guest_entrylo0(cop0); + tlb->tlb_lo1 = kvm_read_c0_guest_entrylo1(cop0); + + kvm_debug + ("[%#x] COP0_TLBWI [%d] (entryhi: %#lx, entrylo0: %#lx entrylo1: %#lx, mask: %#lx)\n", + pc, index, kvm_read_c0_guest_entryhi(cop0), + kvm_read_c0_guest_entrylo0(cop0), kvm_read_c0_guest_entrylo1(cop0), + kvm_read_c0_guest_pagemask(cop0)); + + return er; +} + +/* Write Guest TLB Entry @ Random Index */ +enum emulation_result kvm_mips_emul_tlbwr(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + enum emulation_result er = EMULATE_DONE; + struct kvm_mips_tlb *tlb = NULL; + uint32_t pc = vcpu->arch.pc; + int index; + +#if 1 + get_random_bytes(&index, sizeof(index)); + index &= (KVM_MIPS_GUEST_TLB_SIZE - 1); +#else + index = jiffies % KVM_MIPS_GUEST_TLB_SIZE; +#endif + + if (index < 0 || index >= KVM_MIPS_GUEST_TLB_SIZE) { + printk("%s: illegal index: %d\n", __func__, index); + return EMULATE_FAIL; + } + + tlb = &vcpu->arch.guest_tlb[index]; + +#if 1 + /* Probe the shadow host TLB for the entry being overwritten, if one matches, invalidate it */ + kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi); +#endif + + tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0); + tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0); + tlb->tlb_lo0 = kvm_read_c0_guest_entrylo0(cop0); + tlb->tlb_lo1 = kvm_read_c0_guest_entrylo1(cop0); + + kvm_debug + ("[%#x] COP0_TLBWR[%d] (entryhi: %#lx, entrylo0: %#lx entrylo1: %#lx)\n", + pc, index, kvm_read_c0_guest_entryhi(cop0), + kvm_read_c0_guest_entrylo0(cop0), + kvm_read_c0_guest_entrylo1(cop0)); + + return er; +} + +enum emulation_result kvm_mips_emul_tlbp(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + long entryhi = kvm_read_c0_guest_entryhi(cop0); + enum emulation_result er = EMULATE_DONE; + uint32_t pc = vcpu->arch.pc; + int index = -1; + + index = kvm_mips_guest_tlb_lookup(vcpu, entryhi); + + kvm_write_c0_guest_index(cop0, index); + + kvm_debug("[%#x] COP0_TLBP (entryhi: %#lx), index: %d\n", pc, entryhi, + index); + + return er; +} + +enum emulation_result +kvm_mips_emulate_CP0(uint32_t inst, uint32_t *opc, uint32_t cause, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + enum emulation_result er = EMULATE_DONE; + int32_t rt, rd, copz, sel, co_bit, op; + uint32_t pc = vcpu->arch.pc; + unsigned long curr_pc; + + /* + * Update PC and hold onto current PC in case there is + * an error and we want to rollback the PC + */ + curr_pc = vcpu->arch.pc; + er = update_pc(vcpu, cause); + if (er == EMULATE_FAIL) { + return er; + } + + copz = (inst >> 21) & 0x1f; + rt = (inst >> 16) & 0x1f; + rd = (inst >> 11) & 0x1f; + sel = inst & 0x7; + co_bit = (inst >> 25) & 1; + + /* Verify that the register is valid */ + if (rd > MIPS_CP0_DESAVE) { + printk("Invalid rd: %d\n", rd); + er = EMULATE_FAIL; + goto done; + } + + if (co_bit) { + op = (inst) & 0xff; + + switch (op) { + case tlbr_op: /* Read indexed TLB entry */ + er = kvm_mips_emul_tlbr(vcpu); + break; + case tlbwi_op: /* Write indexed */ + er = kvm_mips_emul_tlbwi(vcpu); + break; + case tlbwr_op: /* Write random */ + er = kvm_mips_emul_tlbwr(vcpu); + break; + case tlbp_op: /* TLB Probe */ + er = kvm_mips_emul_tlbp(vcpu); + break; + case rfe_op: + printk("!!!COP0_RFE!!!\n"); + break; + case eret_op: + er = kvm_mips_emul_eret(vcpu); + goto dont_update_pc; + break; + case wait_op: + er = kvm_mips_emul_wait(vcpu); + break; + } + } else { + switch (copz) { + case mfc_op: +#ifdef CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS + cop0->stat[rd][sel]++; +#endif + /* Get reg */ + if ((rd == MIPS_CP0_COUNT) && (sel == 0)) { + /* XXXKYMA: Run the Guest count register @ 1/4 the rate of the host */ + vcpu->arch.gprs[rt] = (read_c0_count() >> 2); + } else if ((rd == MIPS_CP0_ERRCTL) && (sel == 0)) { + vcpu->arch.gprs[rt] = 0x0; +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + kvm_mips_trans_mfc0(inst, opc, vcpu); +#endif + } + else { + vcpu->arch.gprs[rt] = cop0->reg[rd][sel]; + +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + kvm_mips_trans_mfc0(inst, opc, vcpu); +#endif + } + + kvm_debug + ("[%#x] MFCz[%d][%d], vcpu->arch.gprs[%d]: %#lx\n", + pc, rd, sel, rt, vcpu->arch.gprs[rt]); + + break; + + case dmfc_op: + vcpu->arch.gprs[rt] = cop0->reg[rd][sel]; + break; + + case mtc_op: +#ifdef CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS + cop0->stat[rd][sel]++; +#endif + if ((rd == MIPS_CP0_TLB_INDEX) + && (vcpu->arch.gprs[rt] >= + KVM_MIPS_GUEST_TLB_SIZE)) { + printk("Invalid TLB Index: %ld", + vcpu->arch.gprs[rt]); + er = EMULATE_FAIL; + break; + } +#define C0_EBASE_CORE_MASK 0xff + if ((rd == MIPS_CP0_PRID) && (sel == 1)) { + /* Preserve CORE number */ + kvm_change_c0_guest_ebase(cop0, + ~(C0_EBASE_CORE_MASK), + vcpu->arch.gprs[rt]); + printk("MTCz, cop0->reg[EBASE]: %#lx\n", + kvm_read_c0_guest_ebase(cop0)); + } else if (rd == MIPS_CP0_TLB_HI && sel == 0) { + uint32_t nasid = ASID_MASK(vcpu->arch.gprs[rt]); + if ((KSEGX(vcpu->arch.gprs[rt]) != CKSEG0) + && + (ASID_MASK(kvm_read_c0_guest_entryhi(cop0)) + != nasid)) { + + kvm_debug + ("MTCz, change ASID from %#lx to %#lx\n", + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)), + ASID_MASK(vcpu->arch.gprs[rt])); + + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + } + kvm_write_c0_guest_entryhi(cop0, + vcpu->arch.gprs[rt]); + } + /* Are we writing to COUNT */ + else if ((rd == MIPS_CP0_COUNT) && (sel == 0)) { + /* Linux doesn't seem to write into COUNT, we throw an error + * if we notice a write to COUNT + */ + /*er = EMULATE_FAIL; */ + goto done; + } else if ((rd == MIPS_CP0_COMPARE) && (sel == 0)) { + kvm_debug("[%#x] MTCz, COMPARE %#lx <- %#lx\n", + pc, kvm_read_c0_guest_compare(cop0), + vcpu->arch.gprs[rt]); + + /* If we are writing to COMPARE */ + /* Clear pending timer interrupt, if any */ + kvm_mips_callbacks->dequeue_timer_int(vcpu); + kvm_write_c0_guest_compare(cop0, + vcpu->arch.gprs[rt]); + } else if ((rd == MIPS_CP0_STATUS) && (sel == 0)) { + kvm_write_c0_guest_status(cop0, + vcpu->arch.gprs[rt]); + /* Make sure that CU1 and NMI bits are never set */ + kvm_clear_c0_guest_status(cop0, + (ST0_CU1 | ST0_NMI)); + +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + kvm_mips_trans_mtc0(inst, opc, vcpu); +#endif + } else { + cop0->reg[rd][sel] = vcpu->arch.gprs[rt]; +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + kvm_mips_trans_mtc0(inst, opc, vcpu); +#endif + } + + kvm_debug("[%#x] MTCz, cop0->reg[%d][%d]: %#lx\n", pc, + rd, sel, cop0->reg[rd][sel]); + break; + + case dmtc_op: + printk + ("!!!!!!![%#lx]dmtc_op: rt: %d, rd: %d, sel: %d!!!!!!\n", + vcpu->arch.pc, rt, rd, sel); + er = EMULATE_FAIL; + break; + + case mfmcz_op: +#ifdef KVM_MIPS_DEBUG_COP0_COUNTERS + cop0->stat[MIPS_CP0_STATUS][0]++; +#endif + if (rt != 0) { + vcpu->arch.gprs[rt] = + kvm_read_c0_guest_status(cop0); + } + /* EI */ + if (inst & 0x20) { + kvm_debug("[%#lx] mfmcz_op: EI\n", + vcpu->arch.pc); + kvm_set_c0_guest_status(cop0, ST0_IE); + } else { + kvm_debug("[%#lx] mfmcz_op: DI\n", + vcpu->arch.pc); + kvm_clear_c0_guest_status(cop0, ST0_IE); + } + + break; + + case wrpgpr_op: + { + uint32_t css = + cop0->reg[MIPS_CP0_STATUS][2] & 0xf; + uint32_t pss = + (cop0->reg[MIPS_CP0_STATUS][2] >> 6) & 0xf; + /* We don't support any shadow register sets, so SRSCtl[PSS] == SRSCtl[CSS] = 0 */ + if (css || pss) { + er = EMULATE_FAIL; + break; + } + kvm_debug("WRPGPR[%d][%d] = %#lx\n", pss, rd, + vcpu->arch.gprs[rt]); + vcpu->arch.gprs[rd] = vcpu->arch.gprs[rt]; + } + break; + default: + printk + ("[%#lx]MachEmulateCP0: unsupported COP0, copz: 0x%x\n", + vcpu->arch.pc, copz); + er = EMULATE_FAIL; + break; + } + } + +done: + /* + * Rollback PC only if emulation was unsuccessful + */ + if (er == EMULATE_FAIL) { + vcpu->arch.pc = curr_pc; + } + +dont_update_pc: + /* + * This is for special instructions whose emulation + * updates the PC, so do not overwrite the PC under + * any circumstances + */ + + return er; +} + +enum emulation_result +kvm_mips_emulate_store(uint32_t inst, uint32_t cause, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DO_MMIO; + int32_t op, base, rt, offset; + uint32_t bytes; + void *data = run->mmio.data; + unsigned long curr_pc; + + /* + * Update PC and hold onto current PC in case there is + * an error and we want to rollback the PC + */ + curr_pc = vcpu->arch.pc; + er = update_pc(vcpu, cause); + if (er == EMULATE_FAIL) + return er; + + rt = (inst >> 16) & 0x1f; + base = (inst >> 21) & 0x1f; + offset = inst & 0xffff; + op = (inst >> 26) & 0x3f; + + switch (op) { + case sb_op: + bytes = 1; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + run->mmio.len = bytes; + run->mmio.is_write = 1; + vcpu->mmio_needed = 1; + vcpu->mmio_is_write = 1; + *(u8 *) data = vcpu->arch.gprs[rt]; + kvm_debug("OP_SB: eaddr: %#lx, gpr: %#lx, data: %#x\n", + vcpu->arch.host_cp0_badvaddr, vcpu->arch.gprs[rt], + *(uint8_t *) data); + + break; + + case sw_op: + bytes = 4; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + + run->mmio.len = bytes; + run->mmio.is_write = 1; + vcpu->mmio_needed = 1; + vcpu->mmio_is_write = 1; + *(uint32_t *) data = vcpu->arch.gprs[rt]; + + kvm_debug("[%#lx] OP_SW: eaddr: %#lx, gpr: %#lx, data: %#x\n", + vcpu->arch.pc, vcpu->arch.host_cp0_badvaddr, + vcpu->arch.gprs[rt], *(uint32_t *) data); + break; + + case sh_op: + bytes = 2; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + + run->mmio.len = bytes; + run->mmio.is_write = 1; + vcpu->mmio_needed = 1; + vcpu->mmio_is_write = 1; + *(uint16_t *) data = vcpu->arch.gprs[rt]; + + kvm_debug("[%#lx] OP_SH: eaddr: %#lx, gpr: %#lx, data: %#x\n", + vcpu->arch.pc, vcpu->arch.host_cp0_badvaddr, + vcpu->arch.gprs[rt], *(uint32_t *) data); + break; + + default: + printk("Store not yet supported"); + er = EMULATE_FAIL; + break; + } + + /* + * Rollback PC if emulation was unsuccessful + */ + if (er == EMULATE_FAIL) { + vcpu->arch.pc = curr_pc; + } + + return er; +} + +enum emulation_result +kvm_mips_emulate_load(uint32_t inst, uint32_t cause, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DO_MMIO; + int32_t op, base, rt, offset; + uint32_t bytes; + + rt = (inst >> 16) & 0x1f; + base = (inst >> 21) & 0x1f; + offset = inst & 0xffff; + op = (inst >> 26) & 0x3f; + + vcpu->arch.pending_load_cause = cause; + vcpu->arch.io_gpr = rt; + + switch (op) { + case lw_op: + bytes = 4; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + er = EMULATE_FAIL; + break; + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + + run->mmio.len = bytes; + run->mmio.is_write = 0; + vcpu->mmio_needed = 1; + vcpu->mmio_is_write = 0; + break; + + case lh_op: + case lhu_op: + bytes = 2; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + er = EMULATE_FAIL; + break; + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + + run->mmio.len = bytes; + run->mmio.is_write = 0; + vcpu->mmio_needed = 1; + vcpu->mmio_is_write = 0; + + if (op == lh_op) + vcpu->mmio_needed = 2; + else + vcpu->mmio_needed = 1; + + break; + + case lbu_op: + case lb_op: + bytes = 1; + if (bytes > sizeof(run->mmio.data)) { + kvm_err("%s: bad MMIO length: %d\n", __func__, + run->mmio.len); + er = EMULATE_FAIL; + break; + } + run->mmio.phys_addr = + kvm_mips_callbacks->gva_to_gpa(vcpu->arch. + host_cp0_badvaddr); + if (run->mmio.phys_addr == KVM_INVALID_ADDR) { + er = EMULATE_FAIL; + break; + } + + run->mmio.len = bytes; + run->mmio.is_write = 0; + vcpu->mmio_is_write = 0; + + if (op == lb_op) + vcpu->mmio_needed = 2; + else + vcpu->mmio_needed = 1; + + break; + + default: + printk("Load not yet supported"); + er = EMULATE_FAIL; + break; + } + + return er; +} + +int kvm_mips_sync_icache(unsigned long va, struct kvm_vcpu *vcpu) +{ + unsigned long offset = (va & ~PAGE_MASK); + struct kvm *kvm = vcpu->kvm; + unsigned long pa; + gfn_t gfn; + pfn_t pfn; + + gfn = va >> PAGE_SHIFT; + + if (gfn >= kvm->arch.guest_pmap_npages) { + printk("%s: Invalid gfn: %#llx\n", __func__, gfn); + kvm_mips_dump_host_tlbs(); + kvm_arch_vcpu_dump_regs(vcpu); + return -1; + } + pfn = kvm->arch.guest_pmap[gfn]; + pa = (pfn << PAGE_SHIFT) | offset; + + printk("%s: va: %#lx, unmapped: %#x\n", __func__, va, CKSEG0ADDR(pa)); + + mips32_SyncICache(CKSEG0ADDR(pa), 32); + return 0; +} + +#define MIPS_CACHE_OP_INDEX_INV 0x0 +#define MIPS_CACHE_OP_INDEX_LD_TAG 0x1 +#define MIPS_CACHE_OP_INDEX_ST_TAG 0x2 +#define MIPS_CACHE_OP_IMP 0x3 +#define MIPS_CACHE_OP_HIT_INV 0x4 +#define MIPS_CACHE_OP_FILL_WB_INV 0x5 +#define MIPS_CACHE_OP_HIT_HB 0x6 +#define MIPS_CACHE_OP_FETCH_LOCK 0x7 + +#define MIPS_CACHE_ICACHE 0x0 +#define MIPS_CACHE_DCACHE 0x1 +#define MIPS_CACHE_SEC 0x3 + +enum emulation_result +kvm_mips_emulate_cache(uint32_t inst, uint32_t *opc, uint32_t cause, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + extern void (*r4k_blast_dcache) (void); + extern void (*r4k_blast_icache) (void); + enum emulation_result er = EMULATE_DONE; + int32_t offset, cache, op_inst, op, base; + struct kvm_vcpu_arch *arch = &vcpu->arch; + unsigned long va; + unsigned long curr_pc; + + /* + * Update PC and hold onto current PC in case there is + * an error and we want to rollback the PC + */ + curr_pc = vcpu->arch.pc; + er = update_pc(vcpu, cause); + if (er == EMULATE_FAIL) + return er; + + base = (inst >> 21) & 0x1f; + op_inst = (inst >> 16) & 0x1f; + offset = inst & 0xffff; + cache = (inst >> 16) & 0x3; + op = (inst >> 18) & 0x7; + + va = arch->gprs[base] + offset; + + kvm_debug("CACHE (cache: %#x, op: %#x, base[%d]: %#lx, offset: %#x\n", + cache, op, base, arch->gprs[base], offset); + + /* Treat INDEX_INV as a nop, basically issued by Linux on startup to invalidate + * the caches entirely by stepping through all the ways/indexes + */ + if (op == MIPS_CACHE_OP_INDEX_INV) { + kvm_debug + ("@ %#lx/%#lx CACHE (cache: %#x, op: %#x, base[%d]: %#lx, offset: %#x\n", + vcpu->arch.pc, vcpu->arch.gprs[31], cache, op, base, + arch->gprs[base], offset); + + if (cache == MIPS_CACHE_DCACHE) + r4k_blast_dcache(); + else if (cache == MIPS_CACHE_ICACHE) + r4k_blast_icache(); + else { + printk("%s: unsupported CACHE INDEX operation\n", + __func__); + return EMULATE_FAIL; + } + +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + kvm_mips_trans_cache_index(inst, opc, vcpu); +#endif + goto done; + } + + preempt_disable(); + if (KVM_GUEST_KSEGX(va) == KVM_GUEST_KSEG0) { + + if (kvm_mips_host_tlb_lookup(vcpu, va) < 0) { + kvm_mips_handle_kseg0_tlb_fault(va, vcpu); + } + } else if ((KVM_GUEST_KSEGX(va) < KVM_GUEST_KSEG0) || + KVM_GUEST_KSEGX(va) == KVM_GUEST_KSEG23) { + int index; + + /* If an entry already exists then skip */ + if (kvm_mips_host_tlb_lookup(vcpu, va) >= 0) { + goto skip_fault; + } + + /* If address not in the guest TLB, then give the guest a fault, the + * resulting handler will do the right thing + */ + index = kvm_mips_guest_tlb_lookup(vcpu, (va & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0))); + + if (index < 0) { + vcpu->arch.host_cp0_entryhi = (va & VPN2_MASK); + vcpu->arch.host_cp0_badvaddr = va; + er = kvm_mips_emulate_tlbmiss_ld(cause, NULL, run, + vcpu); + preempt_enable(); + goto dont_update_pc; + } else { + struct kvm_mips_tlb *tlb = &vcpu->arch.guest_tlb[index]; + /* Check if the entry is valid, if not then setup a TLB invalid exception to the guest */ + if (!TLB_IS_VALID(*tlb, va)) { + er = kvm_mips_emulate_tlbinv_ld(cause, NULL, + run, vcpu); + preempt_enable(); + goto dont_update_pc; + } else { + /* We fault an entry from the guest tlb to the shadow host TLB */ + kvm_mips_handle_mapped_seg_tlb_fault(vcpu, tlb, + NULL, + NULL); + } + } + } else { + printk + ("INVALID CACHE INDEX/ADDRESS (cache: %#x, op: %#x, base[%d]: %#lx, offset: %#x\n", + cache, op, base, arch->gprs[base], offset); + er = EMULATE_FAIL; + preempt_enable(); + goto dont_update_pc; + + } + +skip_fault: + /* XXXKYMA: Only a subset of cache ops are supported, used by Linux */ + if (cache == MIPS_CACHE_DCACHE + && (op == MIPS_CACHE_OP_FILL_WB_INV + || op == MIPS_CACHE_OP_HIT_INV)) { + flush_dcache_line(va); + +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + /* Replace the CACHE instruction, with a SYNCI, not the same, but avoids a trap */ + kvm_mips_trans_cache_va(inst, opc, vcpu); +#endif + } else if (op == MIPS_CACHE_OP_HIT_INV && cache == MIPS_CACHE_ICACHE) { + flush_dcache_line(va); + flush_icache_line(va); + +#ifdef CONFIG_KVM_MIPS_DYN_TRANS + /* Replace the CACHE instruction, with a SYNCI */ + kvm_mips_trans_cache_va(inst, opc, vcpu); +#endif + } else { + printk + ("NO-OP CACHE (cache: %#x, op: %#x, base[%d]: %#lx, offset: %#x\n", + cache, op, base, arch->gprs[base], offset); + er = EMULATE_FAIL; + preempt_enable(); + goto dont_update_pc; + } + + preempt_enable(); + + dont_update_pc: + /* + * Rollback PC + */ + vcpu->arch.pc = curr_pc; + done: + return er; +} + +enum emulation_result +kvm_mips_emulate_inst(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DONE; + uint32_t inst; + + /* + * Fetch the instruction. + */ + if (cause & CAUSEF_BD) { + opc += 1; + } + + inst = kvm_get_inst(opc, vcpu); + + switch (((union mips_instruction)inst).r_format.opcode) { + case cop0_op: + er = kvm_mips_emulate_CP0(inst, opc, cause, run, vcpu); + break; + case sb_op: + case sh_op: + case sw_op: + er = kvm_mips_emulate_store(inst, cause, run, vcpu); + break; + case lb_op: + case lbu_op: + case lhu_op: + case lh_op: + case lw_op: + er = kvm_mips_emulate_load(inst, cause, run, vcpu); + break; + + case cache_op: + ++vcpu->stat.cache_exits; + trace_kvm_exit(vcpu, CACHE_EXITS); + er = kvm_mips_emulate_cache(inst, opc, cause, run, vcpu); + break; + + default: + printk("Instruction emulation not supported (%p/%#x)\n", opc, + inst); + kvm_arch_vcpu_dump_regs(vcpu); + er = EMULATE_FAIL; + break; + } + + return er; +} + +enum emulation_result +kvm_mips_emulate_syscall(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("Delivering SYSCALL @ pc %#lx\n", arch->pc); + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_SYSCALL << CAUSEB_EXCCODE)); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + } else { + printk("Trying to deliver SYSCALL when EXL is already set\n"); + er = EMULATE_FAIL; + } + + return er; +} + +enum emulation_result +kvm_mips_emulate_tlbmiss_ld(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + unsigned long entryhi = (vcpu->arch. host_cp0_badvaddr & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)); + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("[EXL == 0] delivering TLB MISS @ pc %#lx\n", + arch->pc); + + /* set pc to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x0; + + } else { + kvm_debug("[EXL == 1] delivering TLB MISS @ pc %#lx\n", + arch->pc); + + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_TLB_LD_MISS << CAUSEB_EXCCODE)); + + /* setup badvaddr, context and entryhi registers for the guest */ + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + /* XXXKYMA: is the context register used by linux??? */ + kvm_write_c0_guest_entryhi(cop0, entryhi); + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + + return er; +} + +enum emulation_result +kvm_mips_emulate_tlbinv_ld(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + unsigned long entryhi = + (vcpu->arch.host_cp0_badvaddr & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)); + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("[EXL == 0] delivering TLB INV @ pc %#lx\n", + arch->pc); + + /* set pc to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + } else { + kvm_debug("[EXL == 1] delivering TLB MISS @ pc %#lx\n", + arch->pc); + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_TLB_LD_MISS << CAUSEB_EXCCODE)); + + /* setup badvaddr, context and entryhi registers for the guest */ + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + /* XXXKYMA: is the context register used by linux??? */ + kvm_write_c0_guest_entryhi(cop0, entryhi); + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + + return er; +} + +enum emulation_result +kvm_mips_emulate_tlbmiss_st(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + unsigned long entryhi = (vcpu->arch.host_cp0_badvaddr & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)); + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("[EXL == 0] Delivering TLB MISS @ pc %#lx\n", + arch->pc); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x0; + } else { + kvm_debug("[EXL == 1] Delivering TLB MISS @ pc %#lx\n", + arch->pc); + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_TLB_ST_MISS << CAUSEB_EXCCODE)); + + /* setup badvaddr, context and entryhi registers for the guest */ + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + /* XXXKYMA: is the context register used by linux??? */ + kvm_write_c0_guest_entryhi(cop0, entryhi); + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + + return er; +} + +enum emulation_result +kvm_mips_emulate_tlbinv_st(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + unsigned long entryhi = (vcpu->arch.host_cp0_badvaddr & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)); + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("[EXL == 0] Delivering TLB MISS @ pc %#lx\n", + arch->pc); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } else { + kvm_debug("[EXL == 1] Delivering TLB MISS @ pc %#lx\n", + arch->pc); + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_TLB_ST_MISS << CAUSEB_EXCCODE)); + + /* setup badvaddr, context and entryhi registers for the guest */ + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + /* XXXKYMA: is the context register used by linux??? */ + kvm_write_c0_guest_entryhi(cop0, entryhi); + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + + return er; +} + +/* TLBMOD: store into address matching TLB with Dirty bit off */ +enum emulation_result +kvm_mips_handle_tlbmod(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DONE; + +#ifdef DEBUG + /* + * If address not in the guest TLB, then we are in trouble + */ + index = kvm_mips_guest_tlb_lookup(vcpu, entryhi); + if (index < 0) { + /* XXXKYMA Invalidate and retry */ + kvm_mips_host_tlb_inv(vcpu, vcpu->arch.host_cp0_badvaddr); + kvm_err("%s: host got TLBMOD for %#lx but entry not present in Guest TLB\n", + __func__, entryhi); + kvm_mips_dump_guest_tlbs(vcpu); + kvm_mips_dump_host_tlbs(); + return EMULATE_FAIL; + } +#endif + + er = kvm_mips_emulate_tlbmod(cause, opc, run, vcpu); + return er; +} + +enum emulation_result +kvm_mips_emulate_tlbmod(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + unsigned long entryhi = (vcpu->arch.host_cp0_badvaddr & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0)); + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("[EXL == 0] Delivering TLB MOD @ pc %#lx\n", + arch->pc); + + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } else { + kvm_debug("[EXL == 1] Delivering TLB MOD @ pc %#lx\n", + arch->pc); + arch->pc = KVM_GUEST_KSEG0 + 0x180; + } + + kvm_change_c0_guest_cause(cop0, (0xff), (T_TLB_MOD << CAUSEB_EXCCODE)); + + /* setup badvaddr, context and entryhi registers for the guest */ + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + /* XXXKYMA: is the context register used by linux??? */ + kvm_write_c0_guest_entryhi(cop0, entryhi); + /* Blow away the shadow host TLBs */ + kvm_mips_flush_host_tlb(1); + + return er; +} + +enum emulation_result +kvm_mips_emulate_fpu_exc(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + } + + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_COP_UNUSABLE << CAUSEB_EXCCODE)); + kvm_change_c0_guest_cause(cop0, (CAUSEF_CE), (0x1 << CAUSEB_CE)); + + return er; +} + +enum emulation_result +kvm_mips_emulate_ri_exc(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("Delivering RI @ pc %#lx\n", arch->pc); + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_RES_INST << CAUSEB_EXCCODE)); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + } else { + kvm_err("Trying to deliver RI when EXL is already set\n"); + er = EMULATE_FAIL; + } + + return er; +} + +enum emulation_result +kvm_mips_emulate_bp_exc(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("Delivering BP @ pc %#lx\n", arch->pc); + + kvm_change_c0_guest_cause(cop0, (0xff), + (T_BREAK << CAUSEB_EXCCODE)); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + } else { + printk("Trying to deliver BP when EXL is already set\n"); + er = EMULATE_FAIL; + } + + return er; +} + +/* + * ll/sc, rdhwr, sync emulation + */ + +#define OPCODE 0xfc000000 +#define BASE 0x03e00000 +#define RT 0x001f0000 +#define OFFSET 0x0000ffff +#define LL 0xc0000000 +#define SC 0xe0000000 +#define SPEC0 0x00000000 +#define SPEC3 0x7c000000 +#define RD 0x0000f800 +#define FUNC 0x0000003f +#define SYNC 0x0000000f +#define RDHWR 0x0000003b + +enum emulation_result +kvm_mips_handle_ri(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + unsigned long curr_pc; + uint32_t inst; + + /* + * Update PC and hold onto current PC in case there is + * an error and we want to rollback the PC + */ + curr_pc = vcpu->arch.pc; + er = update_pc(vcpu, cause); + if (er == EMULATE_FAIL) + return er; + + /* + * Fetch the instruction. + */ + if (cause & CAUSEF_BD) + opc += 1; + + inst = kvm_get_inst(opc, vcpu); + + if (inst == KVM_INVALID_INST) { + printk("%s: Cannot get inst @ %p\n", __func__, opc); + return EMULATE_FAIL; + } + + if ((inst & OPCODE) == SPEC3 && (inst & FUNC) == RDHWR) { + int rd = (inst & RD) >> 11; + int rt = (inst & RT) >> 16; + switch (rd) { + case 0: /* CPU number */ + arch->gprs[rt] = 0; + break; + case 1: /* SYNCI length */ + arch->gprs[rt] = min(current_cpu_data.dcache.linesz, + current_cpu_data.icache.linesz); + break; + case 2: /* Read count register */ + printk("RDHWR: Cont register\n"); + arch->gprs[rt] = kvm_read_c0_guest_count(cop0); + break; + case 3: /* Count register resolution */ + switch (current_cpu_data.cputype) { + case CPU_20KC: + case CPU_25KF: + arch->gprs[rt] = 1; + break; + default: + arch->gprs[rt] = 2; + } + break; + case 29: +#if 1 + arch->gprs[rt] = kvm_read_c0_guest_userlocal(cop0); +#else + /* UserLocal not implemented */ + er = kvm_mips_emulate_ri_exc(cause, opc, run, vcpu); +#endif + break; + + default: + printk("RDHWR not supported\n"); + er = EMULATE_FAIL; + break; + } + } else { + printk("Emulate RI not supported @ %p: %#x\n", opc, inst); + er = EMULATE_FAIL; + } + + /* + * Rollback PC only if emulation was unsuccessful + */ + if (er == EMULATE_FAIL) { + vcpu->arch.pc = curr_pc; + } + return er; +} + +enum emulation_result +kvm_mips_complete_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + unsigned long *gpr = &vcpu->arch.gprs[vcpu->arch.io_gpr]; + enum emulation_result er = EMULATE_DONE; + unsigned long curr_pc; + + if (run->mmio.len > sizeof(*gpr)) { + printk("Bad MMIO length: %d", run->mmio.len); + er = EMULATE_FAIL; + goto done; + } + + /* + * Update PC and hold onto current PC in case there is + * an error and we want to rollback the PC + */ + curr_pc = vcpu->arch.pc; + er = update_pc(vcpu, vcpu->arch.pending_load_cause); + if (er == EMULATE_FAIL) + return er; + + switch (run->mmio.len) { + case 4: + *gpr = *(int32_t *) run->mmio.data; + break; + + case 2: + if (vcpu->mmio_needed == 2) + *gpr = *(int16_t *) run->mmio.data; + else + *gpr = *(int16_t *) run->mmio.data; + + break; + case 1: + if (vcpu->mmio_needed == 2) + *gpr = *(int8_t *) run->mmio.data; + else + *gpr = *(u8 *) run->mmio.data; + break; + } + + if (vcpu->arch.pending_load_cause & CAUSEF_BD) + kvm_debug + ("[%#lx] Completing %d byte BD Load to gpr %d (0x%08lx) type %d\n", + vcpu->arch.pc, run->mmio.len, vcpu->arch.io_gpr, *gpr, + vcpu->mmio_needed); + +done: + return er; +} + +static enum emulation_result +kvm_mips_emulate_exc(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + uint32_t exccode = (cause >> CAUSEB_EXCCODE) & 0x1f; + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_vcpu_arch *arch = &vcpu->arch; + enum emulation_result er = EMULATE_DONE; + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_change_c0_guest_cause(cop0, (0xff), + (exccode << CAUSEB_EXCCODE)); + + /* Set PC to the exception entry point */ + arch->pc = KVM_GUEST_KSEG0 + 0x180; + kvm_write_c0_guest_badvaddr(cop0, vcpu->arch.host_cp0_badvaddr); + + kvm_debug("Delivering EXC %d @ pc %#lx, badVaddr: %#lx\n", + exccode, kvm_read_c0_guest_epc(cop0), + kvm_read_c0_guest_badvaddr(cop0)); + } else { + printk("Trying to deliver EXC when EXL is already set\n"); + er = EMULATE_FAIL; + } + + return er; +} + +enum emulation_result +kvm_mips_check_privilege(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DONE; + uint32_t exccode = (cause >> CAUSEB_EXCCODE) & 0x1f; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + + int usermode = !KVM_GUEST_KERNEL_MODE(vcpu); + + if (usermode) { + switch (exccode) { + case T_INT: + case T_SYSCALL: + case T_BREAK: + case T_RES_INST: + break; + + case T_COP_UNUSABLE: + if (((cause & CAUSEF_CE) >> CAUSEB_CE) == 0) + er = EMULATE_PRIV_FAIL; + break; + + case T_TLB_MOD: + break; + + case T_TLB_LD_MISS: + /* We we are accessing Guest kernel space, then send an address error exception to the guest */ + if (badvaddr >= (unsigned long) KVM_GUEST_KSEG0) { + printk("%s: LD MISS @ %#lx\n", __func__, + badvaddr); + cause &= ~0xff; + cause |= (T_ADDR_ERR_LD << CAUSEB_EXCCODE); + er = EMULATE_PRIV_FAIL; + } + break; + + case T_TLB_ST_MISS: + /* We we are accessing Guest kernel space, then send an address error exception to the guest */ + if (badvaddr >= (unsigned long) KVM_GUEST_KSEG0) { + printk("%s: ST MISS @ %#lx\n", __func__, + badvaddr); + cause &= ~0xff; + cause |= (T_ADDR_ERR_ST << CAUSEB_EXCCODE); + er = EMULATE_PRIV_FAIL; + } + break; + + case T_ADDR_ERR_ST: + printk("%s: address error ST @ %#lx\n", __func__, + badvaddr); + if ((badvaddr & PAGE_MASK) == KVM_GUEST_COMMPAGE_ADDR) { + cause &= ~0xff; + cause |= (T_TLB_ST_MISS << CAUSEB_EXCCODE); + } + er = EMULATE_PRIV_FAIL; + break; + case T_ADDR_ERR_LD: + printk("%s: address error LD @ %#lx\n", __func__, + badvaddr); + if ((badvaddr & PAGE_MASK) == KVM_GUEST_COMMPAGE_ADDR) { + cause &= ~0xff; + cause |= (T_TLB_LD_MISS << CAUSEB_EXCCODE); + } + er = EMULATE_PRIV_FAIL; + break; + default: + er = EMULATE_PRIV_FAIL; + break; + } + } + + if (er == EMULATE_PRIV_FAIL) { + kvm_mips_emulate_exc(cause, opc, run, vcpu); + } + return er; +} + +/* User Address (UA) fault, this could happen if + * (1) TLB entry not present/valid in both Guest and shadow host TLBs, in this + * case we pass on the fault to the guest kernel and let it handle it. + * (2) TLB entry is present in the Guest TLB but not in the shadow, in this + * case we inject the TLB from the Guest TLB into the shadow host TLB + */ +enum emulation_result +kvm_mips_handle_tlbmiss(unsigned long cause, uint32_t *opc, + struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + enum emulation_result er = EMULATE_DONE; + uint32_t exccode = (cause >> CAUSEB_EXCCODE) & 0x1f; + unsigned long va = vcpu->arch.host_cp0_badvaddr; + int index; + + kvm_debug("kvm_mips_handle_tlbmiss: badvaddr: %#lx, entryhi: %#lx\n", + vcpu->arch.host_cp0_badvaddr, vcpu->arch.host_cp0_entryhi); + + /* KVM would not have got the exception if this entry was valid in the shadow host TLB + * Check the Guest TLB, if the entry is not there then send the guest an + * exception. The guest exc handler should then inject an entry into the + * guest TLB + */ + index = kvm_mips_guest_tlb_lookup(vcpu, + (va & VPN2_MASK) | + ASID_MASK(kvm_read_c0_guest_entryhi + (vcpu->arch.cop0))); + if (index < 0) { + if (exccode == T_TLB_LD_MISS) { + er = kvm_mips_emulate_tlbmiss_ld(cause, opc, run, vcpu); + } else if (exccode == T_TLB_ST_MISS) { + er = kvm_mips_emulate_tlbmiss_st(cause, opc, run, vcpu); + } else { + printk("%s: invalid exc code: %d\n", __func__, exccode); + er = EMULATE_FAIL; + } + } else { + struct kvm_mips_tlb *tlb = &vcpu->arch.guest_tlb[index]; + + /* Check if the entry is valid, if not then setup a TLB invalid exception to the guest */ + if (!TLB_IS_VALID(*tlb, va)) { + if (exccode == T_TLB_LD_MISS) { + er = kvm_mips_emulate_tlbinv_ld(cause, opc, run, + vcpu); + } else if (exccode == T_TLB_ST_MISS) { + er = kvm_mips_emulate_tlbinv_st(cause, opc, run, + vcpu); + } else { + printk("%s: invalid exc code: %d\n", __func__, + exccode); + er = EMULATE_FAIL; + } + } else { +#ifdef DEBUG + kvm_debug + ("Injecting hi: %#lx, lo0: %#lx, lo1: %#lx into shadow host TLB\n", + tlb->tlb_hi, tlb->tlb_lo0, tlb->tlb_lo1); +#endif + /* OK we have a Guest TLB entry, now inject it into the shadow host TLB */ + kvm_mips_handle_mapped_seg_tlb_fault(vcpu, tlb, NULL, + NULL); + } + } + + return er; +} diff --git a/arch/mips/kvm/kvm_mips_int.c b/arch/mips/kvm/kvm_mips_int.c new file mode 100644 index 000000000000..1e5de16afe29 --- /dev/null +++ b/arch/mips/kvm/kvm_mips_int.c @@ -0,0 +1,243 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: Interrupt delivery +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "kvm_mips_int.h" + +void kvm_mips_queue_irq(struct kvm_vcpu *vcpu, uint32_t priority) +{ + set_bit(priority, &vcpu->arch.pending_exceptions); +} + +void kvm_mips_dequeue_irq(struct kvm_vcpu *vcpu, uint32_t priority) +{ + clear_bit(priority, &vcpu->arch.pending_exceptions); +} + +void kvm_mips_queue_timer_int_cb(struct kvm_vcpu *vcpu) +{ + /* Cause bits to reflect the pending timer interrupt, + * the EXC code will be set when we are actually + * delivering the interrupt: + */ + kvm_set_c0_guest_cause(vcpu->arch.cop0, (C_IRQ5 | C_TI)); + + /* Queue up an INT exception for the core */ + kvm_mips_queue_irq(vcpu, MIPS_EXC_INT_TIMER); + +} + +void kvm_mips_dequeue_timer_int_cb(struct kvm_vcpu *vcpu) +{ + kvm_clear_c0_guest_cause(vcpu->arch.cop0, (C_IRQ5 | C_TI)); + kvm_mips_dequeue_irq(vcpu, MIPS_EXC_INT_TIMER); +} + +void +kvm_mips_queue_io_int_cb(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq) +{ + int intr = (int)irq->irq; + + /* Cause bits to reflect the pending IO interrupt, + * the EXC code will be set when we are actually + * delivering the interrupt: + */ + switch (intr) { + case 2: + kvm_set_c0_guest_cause(vcpu->arch.cop0, (C_IRQ0)); + /* Queue up an INT exception for the core */ + kvm_mips_queue_irq(vcpu, MIPS_EXC_INT_IO); + break; + + case 3: + kvm_set_c0_guest_cause(vcpu->arch.cop0, (C_IRQ1)); + kvm_mips_queue_irq(vcpu, MIPS_EXC_INT_IPI_1); + break; + + case 4: + kvm_set_c0_guest_cause(vcpu->arch.cop0, (C_IRQ2)); + kvm_mips_queue_irq(vcpu, MIPS_EXC_INT_IPI_2); + break; + + default: + break; + } + +} + +void +kvm_mips_dequeue_io_int_cb(struct kvm_vcpu *vcpu, + struct kvm_mips_interrupt *irq) +{ + int intr = (int)irq->irq; + switch (intr) { + case -2: + kvm_clear_c0_guest_cause(vcpu->arch.cop0, (C_IRQ0)); + kvm_mips_dequeue_irq(vcpu, MIPS_EXC_INT_IO); + break; + + case -3: + kvm_clear_c0_guest_cause(vcpu->arch.cop0, (C_IRQ1)); + kvm_mips_dequeue_irq(vcpu, MIPS_EXC_INT_IPI_1); + break; + + case -4: + kvm_clear_c0_guest_cause(vcpu->arch.cop0, (C_IRQ2)); + kvm_mips_dequeue_irq(vcpu, MIPS_EXC_INT_IPI_2); + break; + + default: + break; + } + +} + +/* Deliver the interrupt of the corresponding priority, if possible. */ +int +kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause) +{ + int allowed = 0; + uint32_t exccode; + + struct kvm_vcpu_arch *arch = &vcpu->arch; + struct mips_coproc *cop0 = vcpu->arch.cop0; + + switch (priority) { + case MIPS_EXC_INT_TIMER: + if ((kvm_read_c0_guest_status(cop0) & ST0_IE) + && (!(kvm_read_c0_guest_status(cop0) & (ST0_EXL | ST0_ERL))) + && (kvm_read_c0_guest_status(cop0) & IE_IRQ5)) { + allowed = 1; + exccode = T_INT; + } + break; + + case MIPS_EXC_INT_IO: + if ((kvm_read_c0_guest_status(cop0) & ST0_IE) + && (!(kvm_read_c0_guest_status(cop0) & (ST0_EXL | ST0_ERL))) + && (kvm_read_c0_guest_status(cop0) & IE_IRQ0)) { + allowed = 1; + exccode = T_INT; + } + break; + + case MIPS_EXC_INT_IPI_1: + if ((kvm_read_c0_guest_status(cop0) & ST0_IE) + && (!(kvm_read_c0_guest_status(cop0) & (ST0_EXL | ST0_ERL))) + && (kvm_read_c0_guest_status(cop0) & IE_IRQ1)) { + allowed = 1; + exccode = T_INT; + } + break; + + case MIPS_EXC_INT_IPI_2: + if ((kvm_read_c0_guest_status(cop0) & ST0_IE) + && (!(kvm_read_c0_guest_status(cop0) & (ST0_EXL | ST0_ERL))) + && (kvm_read_c0_guest_status(cop0) & IE_IRQ2)) { + allowed = 1; + exccode = T_INT; + } + break; + + default: + break; + } + + /* Are we allowed to deliver the interrupt ??? */ + if (allowed) { + + if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) { + /* save old pc */ + kvm_write_c0_guest_epc(cop0, arch->pc); + kvm_set_c0_guest_status(cop0, ST0_EXL); + + if (cause & CAUSEF_BD) + kvm_set_c0_guest_cause(cop0, CAUSEF_BD); + else + kvm_clear_c0_guest_cause(cop0, CAUSEF_BD); + + kvm_debug("Delivering INT @ pc %#lx\n", arch->pc); + + } else + kvm_err("Trying to deliver interrupt when EXL is already set\n"); + + kvm_change_c0_guest_cause(cop0, CAUSEF_EXCCODE, + (exccode << CAUSEB_EXCCODE)); + + /* XXXSL Set PC to the interrupt exception entry point */ + if (kvm_read_c0_guest_cause(cop0) & CAUSEF_IV) + arch->pc = KVM_GUEST_KSEG0 + 0x200; + else + arch->pc = KVM_GUEST_KSEG0 + 0x180; + + clear_bit(priority, &vcpu->arch.pending_exceptions); + } + + return allowed; +} + +int +kvm_mips_irq_clear_cb(struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause) +{ + return 1; +} + +void kvm_mips_deliver_interrupts(struct kvm_vcpu *vcpu, uint32_t cause) +{ + unsigned long *pending = &vcpu->arch.pending_exceptions; + unsigned long *pending_clr = &vcpu->arch.pending_exceptions_clr; + unsigned int priority; + + if (!(*pending) && !(*pending_clr)) + return; + + priority = __ffs(*pending_clr); + while (priority <= MIPS_EXC_MAX) { + if (kvm_mips_callbacks->irq_clear(vcpu, priority, cause)) { + if (!KVM_MIPS_IRQ_CLEAR_ALL_AT_ONCE) + break; + } + + priority = find_next_bit(pending_clr, + BITS_PER_BYTE * sizeof(*pending_clr), + priority + 1); + } + + priority = __ffs(*pending); + while (priority <= MIPS_EXC_MAX) { + if (kvm_mips_callbacks->irq_deliver(vcpu, priority, cause)) { + if (!KVM_MIPS_IRQ_DELIVER_ALL_AT_ONCE) + break; + } + + priority = find_next_bit(pending, + BITS_PER_BYTE * sizeof(*pending), + priority + 1); + } + +} + +int kvm_mips_pending_timer(struct kvm_vcpu *vcpu) +{ + return test_bit(MIPS_EXC_INT_TIMER, &vcpu->arch.pending_exceptions); +} diff --git a/arch/mips/kvm/kvm_mips_int.h b/arch/mips/kvm/kvm_mips_int.h new file mode 100644 index 000000000000..20da7d29eede --- /dev/null +++ b/arch/mips/kvm/kvm_mips_int.h @@ -0,0 +1,49 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: Interrupts +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +/* MIPS Exception Priorities, exceptions (including interrupts) are queued up + * for the guest in the order specified by their priorities + */ + +#define MIPS_EXC_RESET 0 +#define MIPS_EXC_SRESET 1 +#define MIPS_EXC_DEBUG_ST 2 +#define MIPS_EXC_DEBUG 3 +#define MIPS_EXC_DDB 4 +#define MIPS_EXC_NMI 5 +#define MIPS_EXC_MCHK 6 +#define MIPS_EXC_INT_TIMER 7 +#define MIPS_EXC_INT_IO 8 +#define MIPS_EXC_EXECUTE 9 +#define MIPS_EXC_INT_IPI_1 10 +#define MIPS_EXC_INT_IPI_2 11 +#define MIPS_EXC_MAX 12 +/* XXXSL More to follow */ + +#define C_TI (_ULCAST_(1) << 30) + +#define KVM_MIPS_IRQ_DELIVER_ALL_AT_ONCE (0) +#define KVM_MIPS_IRQ_CLEAR_ALL_AT_ONCE (0) + +void kvm_mips_queue_irq(struct kvm_vcpu *vcpu, uint32_t priority); +void kvm_mips_dequeue_irq(struct kvm_vcpu *vcpu, uint32_t priority); +int kvm_mips_pending_timer(struct kvm_vcpu *vcpu); + +void kvm_mips_queue_timer_int_cb(struct kvm_vcpu *vcpu); +void kvm_mips_dequeue_timer_int_cb(struct kvm_vcpu *vcpu); +void kvm_mips_queue_io_int_cb(struct kvm_vcpu *vcpu, + struct kvm_mips_interrupt *irq); +void kvm_mips_dequeue_io_int_cb(struct kvm_vcpu *vcpu, + struct kvm_mips_interrupt *irq); +int kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause); +int kvm_mips_irq_clear_cb(struct kvm_vcpu *vcpu, unsigned int priority, + uint32_t cause); +void kvm_mips_deliver_interrupts(struct kvm_vcpu *vcpu, uint32_t cause); diff --git a/arch/mips/kvm/kvm_mips_opcode.h b/arch/mips/kvm/kvm_mips_opcode.h new file mode 100644 index 000000000000..86d3b4cc348b --- /dev/null +++ b/arch/mips/kvm/kvm_mips_opcode.h @@ -0,0 +1,24 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +/* + * Define opcode values not defined in + */ + +#ifndef __KVM_MIPS_OPCODE_H__ +#define __KVM_MIPS_OPCODE_H__ + +/* COP0 Ops */ +#define mfmcz_op 0x0b /* 01011 */ +#define wrpgpr_op 0x0e /* 01110 */ + +/* COP0 opcodes (only if COP0 and CO=1): */ +#define wait_op 0x20 /* 100000 */ + +#endif /* __KVM_MIPS_OPCODE_H__ */ diff --git a/arch/mips/kvm/kvm_mips_stats.c b/arch/mips/kvm/kvm_mips_stats.c new file mode 100644 index 000000000000..075904bcac1b --- /dev/null +++ b/arch/mips/kvm/kvm_mips_stats.c @@ -0,0 +1,82 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: COP0 access histogram +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include + +char *kvm_mips_exit_types_str[MAX_KVM_MIPS_EXIT_TYPES] = { + "WAIT", + "CACHE", + "Signal", + "Interrupt", + "COP0/1 Unusable", + "TLB Mod", + "TLB Miss (LD)", + "TLB Miss (ST)", + "Address Err (ST)", + "Address Error (LD)", + "System Call", + "Reserved Inst", + "Break Inst", + "D-Cache Flushes", +}; + +char *kvm_cop0_str[N_MIPS_COPROC_REGS] = { + "Index", + "Random", + "EntryLo0", + "EntryLo1", + "Context", + "PG Mask", + "Wired", + "HWREna", + "BadVAddr", + "Count", + "EntryHI", + "Compare", + "Status", + "Cause", + "EXC PC", + "PRID", + "Config", + "LLAddr", + "Watch Lo", + "Watch Hi", + "X Context", + "Reserved", + "Impl Dep", + "Debug", + "DEPC", + "PerfCnt", + "ErrCtl", + "CacheErr", + "TagLo", + "TagHi", + "ErrorEPC", + "DESAVE" +}; + +int kvm_mips_dump_stats(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS + int i, j; + + printk("\nKVM VCPU[%d] COP0 Access Profile:\n", vcpu->vcpu_id); + for (i = 0; i < N_MIPS_COPROC_REGS; i++) { + for (j = 0; j < N_MIPS_COPROC_SEL; j++) { + if (vcpu->arch.cop0->stat[i][j]) + printk("%s[%d]: %lu\n", kvm_cop0_str[i], j, + vcpu->arch.cop0->stat[i][j]); + } + } +#endif + + return 0; +} diff --git a/arch/mips/kvm/kvm_tlb.c b/arch/mips/kvm/kvm_tlb.c new file mode 100644 index 000000000000..89511a9258d3 --- /dev/null +++ b/arch/mips/kvm/kvm_tlb.c @@ -0,0 +1,928 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS TLB handling, this file is part of the Linux host kernel so that +* TLB handlers run from KSEG0 +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#undef CONFIG_MIPS_MT +#include +#define CONFIG_MIPS_MT + +#define KVM_GUEST_PC_TLB 0 +#define KVM_GUEST_SP_TLB 1 + +#define PRIx64 "llx" + +/* Use VZ EntryHi.EHINV to invalidate TLB entries */ +#define UNIQUE_ENTRYHI(idx) (CKSEG0 + ((idx) << (PAGE_SHIFT + 1))) + +atomic_t kvm_mips_instance; +EXPORT_SYMBOL(kvm_mips_instance); + +/* These function pointers are initialized once the KVM module is loaded */ +pfn_t(*kvm_mips_gfn_to_pfn) (struct kvm *kvm, gfn_t gfn); +EXPORT_SYMBOL(kvm_mips_gfn_to_pfn); + +void (*kvm_mips_release_pfn_clean) (pfn_t pfn); +EXPORT_SYMBOL(kvm_mips_release_pfn_clean); + +bool(*kvm_mips_is_error_pfn) (pfn_t pfn); +EXPORT_SYMBOL(kvm_mips_is_error_pfn); + +uint32_t kvm_mips_get_kernel_asid(struct kvm_vcpu *vcpu) +{ + return ASID_MASK(vcpu->arch.guest_kernel_asid[smp_processor_id()]); +} + + +uint32_t kvm_mips_get_user_asid(struct kvm_vcpu *vcpu) +{ + return ASID_MASK(vcpu->arch.guest_user_asid[smp_processor_id()]); +} + +inline uint32_t kvm_mips_get_commpage_asid (struct kvm_vcpu *vcpu) +{ + return vcpu->kvm->arch.commpage_tlb; +} + + +/* + * Structure defining an tlb entry data set. + */ + +void kvm_mips_dump_host_tlbs(void) +{ + unsigned long old_entryhi; + unsigned long old_pagemask; + struct kvm_mips_tlb tlb; + unsigned long flags; + int i; + + local_irq_save(flags); + + old_entryhi = read_c0_entryhi(); + old_pagemask = read_c0_pagemask(); + + printk("HOST TLBs:\n"); + printk("ASID: %#lx\n", ASID_MASK(read_c0_entryhi())); + + for (i = 0; i < current_cpu_data.tlbsize; i++) { + write_c0_index(i); + mtc0_tlbw_hazard(); + + tlb_read(); + tlbw_use_hazard(); + + tlb.tlb_hi = read_c0_entryhi(); + tlb.tlb_lo0 = read_c0_entrylo0(); + tlb.tlb_lo1 = read_c0_entrylo1(); + tlb.tlb_mask = read_c0_pagemask(); + + printk("TLB%c%3d Hi 0x%08lx ", + (tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*', + i, tlb.tlb_hi); + printk("Lo0=0x%09" PRIx64 " %c%c attr %lx ", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0), + (tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo0 >> 3) & 7); + printk("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1), + (tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask); + } + write_c0_entryhi(old_entryhi); + write_c0_pagemask(old_pagemask); + mtc0_tlbw_hazard(); + local_irq_restore(flags); +} + +void kvm_mips_dump_guest_tlbs(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + struct kvm_mips_tlb tlb; + int i; + + printk("Guest TLBs:\n"); + printk("Guest EntryHi: %#lx\n", kvm_read_c0_guest_entryhi(cop0)); + + for (i = 0; i < KVM_MIPS_GUEST_TLB_SIZE; i++) { + tlb = vcpu->arch.guest_tlb[i]; + printk("TLB%c%3d Hi 0x%08lx ", + (tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*', + i, tlb.tlb_hi); + printk("Lo0=0x%09" PRIx64 " %c%c attr %lx ", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0), + (tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo0 >> 3) & 7); + printk("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1), + (tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask); + } +} + +void kvm_mips_dump_shadow_tlbs(struct kvm_vcpu *vcpu) +{ + int i; + volatile struct kvm_mips_tlb tlb; + + printk("Shadow TLBs:\n"); + for (i = 0; i < KVM_MIPS_GUEST_TLB_SIZE; i++) { + tlb = vcpu->arch.shadow_tlb[smp_processor_id()][i]; + printk("TLB%c%3d Hi 0x%08lx ", + (tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*', + i, tlb.tlb_hi); + printk("Lo0=0x%09" PRIx64 " %c%c attr %lx ", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0), + (tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo0 >> 3) & 7); + printk("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n", + (uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1), + (tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ', + (tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ', + (tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask); + } +} + +static void kvm_mips_map_page(struct kvm *kvm, gfn_t gfn) +{ + pfn_t pfn; + + if (kvm->arch.guest_pmap[gfn] != KVM_INVALID_PAGE) + return; + + pfn = kvm_mips_gfn_to_pfn(kvm, gfn); + + if (kvm_mips_is_error_pfn(pfn)) { + panic("Couldn't get pfn for gfn %#" PRIx64 "!\n", gfn); + } + + kvm->arch.guest_pmap[gfn] = pfn; + return; +} + +/* Translate guest KSEG0 addresses to Host PA */ +unsigned long kvm_mips_translate_guest_kseg0_to_hpa(struct kvm_vcpu *vcpu, + unsigned long gva) +{ + gfn_t gfn; + uint32_t offset = gva & ~PAGE_MASK; + struct kvm *kvm = vcpu->kvm; + + if (KVM_GUEST_KSEGX(gva) != KVM_GUEST_KSEG0) { + kvm_err("%s/%p: Invalid gva: %#lx\n", __func__, + __builtin_return_address(0), gva); + return KVM_INVALID_PAGE; + } + + gfn = (KVM_GUEST_CPHYSADDR(gva) >> PAGE_SHIFT); + + if (gfn >= kvm->arch.guest_pmap_npages) { + kvm_err("%s: Invalid gfn: %#llx, GVA: %#lx\n", __func__, gfn, + gva); + return KVM_INVALID_PAGE; + } + kvm_mips_map_page(vcpu->kvm, gfn); + return (kvm->arch.guest_pmap[gfn] << PAGE_SHIFT) + offset; +} + +/* XXXKYMA: Must be called with interrupts disabled */ +/* set flush_dcache_mask == 0 if no dcache flush required */ +int +kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, unsigned long entryhi, + unsigned long entrylo0, unsigned long entrylo1, int flush_dcache_mask) +{ + unsigned long flags; + unsigned long old_entryhi; + volatile int idx; + + local_irq_save(flags); + + + old_entryhi = read_c0_entryhi(); + write_c0_entryhi(entryhi); + mtc0_tlbw_hazard(); + + tlb_probe(); + tlb_probe_hazard(); + idx = read_c0_index(); + + if (idx > current_cpu_data.tlbsize) { + kvm_err("%s: Invalid Index: %d\n", __func__, idx); + kvm_mips_dump_host_tlbs(); + return -1; + } + + if (idx < 0) { + idx = read_c0_random() % current_cpu_data.tlbsize; + write_c0_index(idx); + mtc0_tlbw_hazard(); + } + write_c0_entrylo0(entrylo0); + write_c0_entrylo1(entrylo1); + mtc0_tlbw_hazard(); + + tlb_write_indexed(); + tlbw_use_hazard(); + +#ifdef DEBUG + if (debug) { + kvm_debug("@ %#lx idx: %2d [entryhi(R): %#lx] " + "entrylo0(R): 0x%08lx, entrylo1(R): 0x%08lx\n", + vcpu->arch.pc, idx, read_c0_entryhi(), + read_c0_entrylo0(), read_c0_entrylo1()); + } +#endif + + /* Flush D-cache */ + if (flush_dcache_mask) { + if (entrylo0 & MIPS3_PG_V) { + ++vcpu->stat.flush_dcache_exits; + flush_data_cache_page((entryhi & VPN2_MASK) & ~flush_dcache_mask); + } + if (entrylo1 & MIPS3_PG_V) { + ++vcpu->stat.flush_dcache_exits; + flush_data_cache_page(((entryhi & VPN2_MASK) & ~flush_dcache_mask) | + (0x1 << PAGE_SHIFT)); + } + } + + /* Restore old ASID */ + write_c0_entryhi(old_entryhi); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + local_irq_restore(flags); + return 0; +} + + +/* XXXKYMA: Must be called with interrupts disabled */ +int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr, + struct kvm_vcpu *vcpu) +{ + gfn_t gfn; + pfn_t pfn0, pfn1; + unsigned long vaddr = 0; + unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0; + int even; + struct kvm *kvm = vcpu->kvm; + const int flush_dcache_mask = 0; + + + if (KVM_GUEST_KSEGX(badvaddr) != KVM_GUEST_KSEG0) { + kvm_err("%s: Invalid BadVaddr: %#lx\n", __func__, badvaddr); + kvm_mips_dump_host_tlbs(); + return -1; + } + + gfn = (KVM_GUEST_CPHYSADDR(badvaddr) >> PAGE_SHIFT); + if (gfn >= kvm->arch.guest_pmap_npages) { + kvm_err("%s: Invalid gfn: %#llx, BadVaddr: %#lx\n", __func__, + gfn, badvaddr); + kvm_mips_dump_host_tlbs(); + return -1; + } + even = !(gfn & 0x1); + vaddr = badvaddr & (PAGE_MASK << 1); + + kvm_mips_map_page(vcpu->kvm, gfn); + kvm_mips_map_page(vcpu->kvm, gfn ^ 0x1); + + if (even) { + pfn0 = kvm->arch.guest_pmap[gfn]; + pfn1 = kvm->arch.guest_pmap[gfn ^ 0x1]; + } else { + pfn0 = kvm->arch.guest_pmap[gfn ^ 0x1]; + pfn1 = kvm->arch.guest_pmap[gfn]; + } + + entryhi = (vaddr | kvm_mips_get_kernel_asid(vcpu)); + entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) | + (0x1 << 1); + entrylo1 = mips3_paddr_to_tlbpfn(pfn1 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) | + (0x1 << 1); + + return kvm_mips_host_tlb_write(vcpu, entryhi, entrylo0, entrylo1, + flush_dcache_mask); +} + +int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr, + struct kvm_vcpu *vcpu) +{ + pfn_t pfn0, pfn1; + unsigned long flags, old_entryhi = 0, vaddr = 0; + unsigned long entrylo0 = 0, entrylo1 = 0; + + + pfn0 = CPHYSADDR(vcpu->arch.kseg0_commpage) >> PAGE_SHIFT; + pfn1 = 0; + entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) | + (0x1 << 1); + entrylo1 = 0; + + local_irq_save(flags); + + old_entryhi = read_c0_entryhi(); + vaddr = badvaddr & (PAGE_MASK << 1); + write_c0_entryhi(vaddr | kvm_mips_get_kernel_asid(vcpu)); + mtc0_tlbw_hazard(); + write_c0_entrylo0(entrylo0); + mtc0_tlbw_hazard(); + write_c0_entrylo1(entrylo1); + mtc0_tlbw_hazard(); + write_c0_index(kvm_mips_get_commpage_asid(vcpu)); + mtc0_tlbw_hazard(); + tlb_write_indexed(); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + +#ifdef DEBUG + kvm_debug ("@ %#lx idx: %2d [entryhi(R): %#lx] entrylo0 (R): 0x%08lx, entrylo1(R): 0x%08lx\n", + vcpu->arch.pc, read_c0_index(), read_c0_entryhi(), + read_c0_entrylo0(), read_c0_entrylo1()); +#endif + + /* Restore old ASID */ + write_c0_entryhi(old_entryhi); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + local_irq_restore(flags); + + return 0; +} + +int +kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu, + struct kvm_mips_tlb *tlb, unsigned long *hpa0, unsigned long *hpa1) +{ + unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0; + struct kvm *kvm = vcpu->kvm; + pfn_t pfn0, pfn1; + + + if ((tlb->tlb_hi & VPN2_MASK) == 0) { + pfn0 = 0; + pfn1 = 0; + } else { + kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo0) >> PAGE_SHIFT); + kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo1) >> PAGE_SHIFT); + + pfn0 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo0) >> PAGE_SHIFT]; + pfn1 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo1) >> PAGE_SHIFT]; + } + + if (hpa0) + *hpa0 = pfn0 << PAGE_SHIFT; + + if (hpa1) + *hpa1 = pfn1 << PAGE_SHIFT; + + /* Get attributes from the Guest TLB */ + entryhi = (tlb->tlb_hi & VPN2_MASK) | (KVM_GUEST_KERNEL_MODE(vcpu) ? + kvm_mips_get_kernel_asid(vcpu) : kvm_mips_get_user_asid(vcpu)); + entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | + (tlb->tlb_lo0 & MIPS3_PG_D) | (tlb->tlb_lo0 & MIPS3_PG_V); + entrylo1 = mips3_paddr_to_tlbpfn(pfn1 << PAGE_SHIFT) | (0x3 << 3) | + (tlb->tlb_lo1 & MIPS3_PG_D) | (tlb->tlb_lo1 & MIPS3_PG_V); + +#ifdef DEBUG + kvm_debug("@ %#lx tlb_lo0: 0x%08lx tlb_lo1: 0x%08lx\n", vcpu->arch.pc, + tlb->tlb_lo0, tlb->tlb_lo1); +#endif + + return kvm_mips_host_tlb_write(vcpu, entryhi, entrylo0, entrylo1, + tlb->tlb_mask); +} + +int kvm_mips_guest_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long entryhi) +{ + int i; + int index = -1; + struct kvm_mips_tlb *tlb = vcpu->arch.guest_tlb; + + + for (i = 0; i < KVM_MIPS_GUEST_TLB_SIZE; i++) { + if (((TLB_VPN2(tlb[i]) & ~tlb[i].tlb_mask) == ((entryhi & VPN2_MASK) & ~tlb[i].tlb_mask)) && + (TLB_IS_GLOBAL(tlb[i]) || (TLB_ASID(tlb[i]) == ASID_MASK(entryhi)))) { + index = i; + break; + } + } + +#ifdef DEBUG + kvm_debug("%s: entryhi: %#lx, index: %d lo0: %#lx, lo1: %#lx\n", + __func__, entryhi, index, tlb[i].tlb_lo0, tlb[i].tlb_lo1); +#endif + + return index; +} + +int kvm_mips_host_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long vaddr) +{ + unsigned long old_entryhi, flags; + volatile int idx; + + + local_irq_save(flags); + + old_entryhi = read_c0_entryhi(); + + if (KVM_GUEST_KERNEL_MODE(vcpu)) + write_c0_entryhi((vaddr & VPN2_MASK) | kvm_mips_get_kernel_asid(vcpu)); + else { + write_c0_entryhi((vaddr & VPN2_MASK) | kvm_mips_get_user_asid(vcpu)); + } + + mtc0_tlbw_hazard(); + + tlb_probe(); + tlb_probe_hazard(); + idx = read_c0_index(); + + /* Restore old ASID */ + write_c0_entryhi(old_entryhi); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + + local_irq_restore(flags); + +#ifdef DEBUG + kvm_debug("Host TLB lookup, %#lx, idx: %2d\n", vaddr, idx); +#endif + + return idx; +} + +int kvm_mips_host_tlb_inv(struct kvm_vcpu *vcpu, unsigned long va) +{ + int idx; + unsigned long flags, old_entryhi; + + local_irq_save(flags); + + + old_entryhi = read_c0_entryhi(); + + write_c0_entryhi((va & VPN2_MASK) | kvm_mips_get_user_asid(vcpu)); + mtc0_tlbw_hazard(); + + tlb_probe(); + tlb_probe_hazard(); + idx = read_c0_index(); + + if (idx >= current_cpu_data.tlbsize) + BUG(); + + if (idx > 0) { + write_c0_entryhi(UNIQUE_ENTRYHI(idx)); + mtc0_tlbw_hazard(); + + write_c0_entrylo0(0); + mtc0_tlbw_hazard(); + + write_c0_entrylo1(0); + mtc0_tlbw_hazard(); + + tlb_write_indexed(); + mtc0_tlbw_hazard(); + } + + write_c0_entryhi(old_entryhi); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + + local_irq_restore(flags); + +#ifdef DEBUG + if (idx > 0) { + kvm_debug("%s: Invalidated entryhi %#lx @ idx %d\n", __func__, + (va & VPN2_MASK) | (vcpu->arch.asid_map[va & ASID_MASK] & ASID_MASK), idx); + } +#endif + + return 0; +} + +/* XXXKYMA: Fix Guest USER/KERNEL no longer share the same ASID*/ +int kvm_mips_host_tlb_inv_index(struct kvm_vcpu *vcpu, int index) +{ + unsigned long flags, old_entryhi; + + if (index >= current_cpu_data.tlbsize) + BUG(); + + local_irq_save(flags); + + + old_entryhi = read_c0_entryhi(); + + write_c0_entryhi(UNIQUE_ENTRYHI(index)); + mtc0_tlbw_hazard(); + + write_c0_index(index); + mtc0_tlbw_hazard(); + + write_c0_entrylo0(0); + mtc0_tlbw_hazard(); + + write_c0_entrylo1(0); + mtc0_tlbw_hazard(); + + tlb_write_indexed(); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + + write_c0_entryhi(old_entryhi); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + + local_irq_restore(flags); + + return 0; +} + +void kvm_mips_flush_host_tlb(int skip_kseg0) +{ + unsigned long flags; + unsigned long old_entryhi, entryhi; + unsigned long old_pagemask; + int entry = 0; + int maxentry = current_cpu_data.tlbsize; + + + local_irq_save(flags); + + old_entryhi = read_c0_entryhi(); + old_pagemask = read_c0_pagemask(); + + /* Blast 'em all away. */ + for (entry = 0; entry < maxentry; entry++) { + + write_c0_index(entry); + mtc0_tlbw_hazard(); + + if (skip_kseg0) { + tlb_read(); + tlbw_use_hazard(); + + entryhi = read_c0_entryhi(); + + /* Don't blow away guest kernel entries */ + if (KVM_GUEST_KSEGX(entryhi) == KVM_GUEST_KSEG0) { + continue; + } + } + + /* Make sure all entries differ. */ + write_c0_entryhi(UNIQUE_ENTRYHI(entry)); + mtc0_tlbw_hazard(); + write_c0_entrylo0(0); + mtc0_tlbw_hazard(); + write_c0_entrylo1(0); + mtc0_tlbw_hazard(); + + tlb_write_indexed(); + mtc0_tlbw_hazard(); + } + + tlbw_use_hazard(); + + write_c0_entryhi(old_entryhi); + write_c0_pagemask(old_pagemask); + mtc0_tlbw_hazard(); + tlbw_use_hazard(); + + local_irq_restore(flags); +} + +void +kvm_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu, + struct kvm_vcpu *vcpu) +{ + unsigned long asid = asid_cache(cpu); + + if (!(ASID_MASK(ASID_INC(asid)))) { + if (cpu_has_vtag_icache) { + flush_icache_all(); + } + + kvm_local_flush_tlb_all(); /* start new asid cycle */ + + if (!asid) /* fix version if needed */ + asid = ASID_FIRST_VERSION; + } + + cpu_context(cpu, mm) = asid_cache(cpu) = asid; +} + +void kvm_shadow_tlb_put(struct kvm_vcpu *vcpu) +{ + unsigned long flags; + unsigned long old_entryhi; + unsigned long old_pagemask; + int entry = 0; + int cpu = smp_processor_id(); + + local_irq_save(flags); + + old_entryhi = read_c0_entryhi(); + old_pagemask = read_c0_pagemask(); + + for (entry = 0; entry < current_cpu_data.tlbsize; entry++) { + write_c0_index(entry); + mtc0_tlbw_hazard(); + tlb_read(); + tlbw_use_hazard(); + + vcpu->arch.shadow_tlb[cpu][entry].tlb_hi = read_c0_entryhi(); + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo0 = read_c0_entrylo0(); + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo1 = read_c0_entrylo1(); + vcpu->arch.shadow_tlb[cpu][entry].tlb_mask = read_c0_pagemask(); + } + + write_c0_entryhi(old_entryhi); + write_c0_pagemask(old_pagemask); + mtc0_tlbw_hazard(); + + local_irq_restore(flags); + +} + +void kvm_shadow_tlb_load(struct kvm_vcpu *vcpu) +{ + unsigned long flags; + unsigned long old_ctx; + int entry; + int cpu = smp_processor_id(); + + local_irq_save(flags); + + old_ctx = read_c0_entryhi(); + + for (entry = 0; entry < current_cpu_data.tlbsize; entry++) { + write_c0_entryhi(vcpu->arch.shadow_tlb[cpu][entry].tlb_hi); + mtc0_tlbw_hazard(); + write_c0_entrylo0(vcpu->arch.shadow_tlb[cpu][entry].tlb_lo0); + write_c0_entrylo1(vcpu->arch.shadow_tlb[cpu][entry].tlb_lo1); + + write_c0_index(entry); + mtc0_tlbw_hazard(); + + tlb_write_indexed(); + tlbw_use_hazard(); + } + + tlbw_use_hazard(); + write_c0_entryhi(old_ctx); + mtc0_tlbw_hazard(); + local_irq_restore(flags); +} + + +void kvm_local_flush_tlb_all(void) +{ + unsigned long flags; + unsigned long old_ctx; + int entry = 0; + + local_irq_save(flags); + /* Save old context and create impossible VPN2 value */ + old_ctx = read_c0_entryhi(); + write_c0_entrylo0(0); + write_c0_entrylo1(0); + + /* Blast 'em all away. */ + while (entry < current_cpu_data.tlbsize) { + /* Make sure all entries differ. */ + write_c0_entryhi(UNIQUE_ENTRYHI(entry)); + write_c0_index(entry); + mtc0_tlbw_hazard(); + tlb_write_indexed(); + entry++; + } + tlbw_use_hazard(); + write_c0_entryhi(old_ctx); + mtc0_tlbw_hazard(); + + local_irq_restore(flags); +} + +void kvm_mips_init_shadow_tlb(struct kvm_vcpu *vcpu) +{ + int cpu, entry; + + for_each_possible_cpu(cpu) { + for (entry = 0; entry < current_cpu_data.tlbsize; entry++) { + vcpu->arch.shadow_tlb[cpu][entry].tlb_hi = + UNIQUE_ENTRYHI(entry); + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo0 = 0x0; + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo1 = 0x0; + vcpu->arch.shadow_tlb[cpu][entry].tlb_mask = + read_c0_pagemask(); +#ifdef DEBUG + kvm_debug + ("shadow_tlb[%d][%d]: tlb_hi: %#lx, lo0: %#lx, lo1: %#lx\n", + cpu, entry, + vcpu->arch.shadow_tlb[cpu][entry].tlb_hi, + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo0, + vcpu->arch.shadow_tlb[cpu][entry].tlb_lo1); +#endif + } + } +} + +/* Restore ASID once we are scheduled back after preemption */ +void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ + unsigned long flags; + int newasid = 0; + +#ifdef DEBUG + kvm_debug("%s: vcpu %p, cpu: %d\n", __func__, vcpu, cpu); +#endif + + /* Alocate new kernel and user ASIDs if needed */ + + local_irq_save(flags); + + if (((vcpu->arch. + guest_kernel_asid[cpu] ^ asid_cache(cpu)) & ASID_VERSION_MASK)) { + kvm_get_new_mmu_context(&vcpu->arch.guest_kernel_mm, cpu, vcpu); + vcpu->arch.guest_kernel_asid[cpu] = + vcpu->arch.guest_kernel_mm.context.asid[cpu]; + kvm_get_new_mmu_context(&vcpu->arch.guest_user_mm, cpu, vcpu); + vcpu->arch.guest_user_asid[cpu] = + vcpu->arch.guest_user_mm.context.asid[cpu]; + newasid++; + + kvm_info("[%d]: cpu_context: %#lx\n", cpu, + cpu_context(cpu, current->mm)); + kvm_info("[%d]: Allocated new ASID for Guest Kernel: %#x\n", + cpu, vcpu->arch.guest_kernel_asid[cpu]); + kvm_info("[%d]: Allocated new ASID for Guest User: %#x\n", cpu, + vcpu->arch.guest_user_asid[cpu]); + } + + if (vcpu->arch.last_sched_cpu != cpu) { + kvm_info("[%d->%d]KVM VCPU[%d] switch\n", + vcpu->arch.last_sched_cpu, cpu, vcpu->vcpu_id); + } + + /* Only reload shadow host TLB if new ASIDs haven't been allocated */ +#if 0 + if ((atomic_read(&kvm_mips_instance) > 1) && !newasid) { + kvm_mips_flush_host_tlb(0); + kvm_shadow_tlb_load(vcpu); + } +#endif + + if (!newasid) { + /* If we preempted while the guest was executing, then reload the pre-empted ASID */ + if (current->flags & PF_VCPU) { + write_c0_entryhi(ASID_MASK(vcpu->arch.preempt_entryhi)); + ehb(); + } + } else { + /* New ASIDs were allocated for the VM */ + + /* Were we in guest context? If so then the pre-empted ASID is no longer + * valid, we need to set it to what it should be based on the mode of + * the Guest (Kernel/User) + */ + if (current->flags & PF_VCPU) { + if (KVM_GUEST_KERNEL_MODE(vcpu)) + write_c0_entryhi(ASID_MASK(vcpu->arch. + guest_kernel_asid[cpu])); + else + write_c0_entryhi(ASID_MASK(vcpu->arch. + guest_user_asid[cpu])); + ehb(); + } + } + + local_irq_restore(flags); + +} + +/* ASID can change if another task is scheduled during preemption */ +void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) +{ + unsigned long flags; + uint32_t cpu; + + local_irq_save(flags); + + cpu = smp_processor_id(); + + + vcpu->arch.preempt_entryhi = read_c0_entryhi(); + vcpu->arch.last_sched_cpu = cpu; + +#if 0 + if ((atomic_read(&kvm_mips_instance) > 1)) { + kvm_shadow_tlb_put(vcpu); + } +#endif + + if (((cpu_context(cpu, current->mm) ^ asid_cache(cpu)) & + ASID_VERSION_MASK)) { + kvm_debug("%s: Dropping MMU Context: %#lx\n", __func__, + cpu_context(cpu, current->mm)); + drop_mmu_context(current->mm, cpu); + } + write_c0_entryhi(cpu_asid(cpu, current->mm)); + ehb(); + + local_irq_restore(flags); +} + +uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + unsigned long paddr, flags; + uint32_t inst; + int index; + + if (KVM_GUEST_KSEGX((unsigned long) opc) < KVM_GUEST_KSEG0 || + KVM_GUEST_KSEGX((unsigned long) opc) == KVM_GUEST_KSEG23) { + local_irq_save(flags); + index = kvm_mips_host_tlb_lookup(vcpu, (unsigned long) opc); + if (index >= 0) { + inst = *(opc); + } else { + index = + kvm_mips_guest_tlb_lookup(vcpu, + ((unsigned long) opc & VPN2_MASK) + | + ASID_MASK(kvm_read_c0_guest_entryhi(cop0))); + if (index < 0) { + kvm_err + ("%s: get_user_failed for %p, vcpu: %p, ASID: %#lx\n", + __func__, opc, vcpu, read_c0_entryhi()); + kvm_mips_dump_host_tlbs(); + local_irq_restore(flags); + return KVM_INVALID_INST; + } + kvm_mips_handle_mapped_seg_tlb_fault(vcpu, + &vcpu->arch. + guest_tlb[index], + NULL, NULL); + inst = *(opc); + } + local_irq_restore(flags); + } else if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) { + paddr = + kvm_mips_translate_guest_kseg0_to_hpa(vcpu, + (unsigned long) opc); + inst = *(uint32_t *) CKSEG0ADDR(paddr); + } else { + kvm_err("%s: illegal address: %p\n", __func__, opc); + return KVM_INVALID_INST; + } + + return inst; +} + +EXPORT_SYMBOL(kvm_local_flush_tlb_all); +EXPORT_SYMBOL(kvm_shadow_tlb_put); +EXPORT_SYMBOL(kvm_mips_handle_mapped_seg_tlb_fault); +EXPORT_SYMBOL(kvm_mips_handle_commpage_tlb_fault); +EXPORT_SYMBOL(kvm_mips_init_shadow_tlb); +EXPORT_SYMBOL(kvm_mips_dump_host_tlbs); +EXPORT_SYMBOL(kvm_mips_handle_kseg0_tlb_fault); +EXPORT_SYMBOL(kvm_mips_host_tlb_lookup); +EXPORT_SYMBOL(kvm_mips_flush_host_tlb); +EXPORT_SYMBOL(kvm_mips_guest_tlb_lookup); +EXPORT_SYMBOL(kvm_mips_host_tlb_inv); +EXPORT_SYMBOL(kvm_mips_translate_guest_kseg0_to_hpa); +EXPORT_SYMBOL(kvm_shadow_tlb_load); +EXPORT_SYMBOL(kvm_mips_dump_shadow_tlbs); +EXPORT_SYMBOL(kvm_mips_dump_guest_tlbs); +EXPORT_SYMBOL(kvm_get_inst); +EXPORT_SYMBOL(kvm_arch_vcpu_load); +EXPORT_SYMBOL(kvm_arch_vcpu_put); diff --git a/arch/mips/kvm/kvm_trap_emul.c b/arch/mips/kvm/kvm_trap_emul.c new file mode 100644 index 000000000000..466aeef044bd --- /dev/null +++ b/arch/mips/kvm/kvm_trap_emul.c @@ -0,0 +1,482 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* KVM/MIPS: Deliver/Emulate exceptions to the guest kernel +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#include +#include +#include +#include + +#include + +#include "kvm_mips_opcode.h" +#include "kvm_mips_int.h" + +static gpa_t kvm_trap_emul_gva_to_gpa_cb(gva_t gva) +{ + gpa_t gpa; + uint32_t kseg = KSEGX(gva); + + if ((kseg == CKSEG0) || (kseg == CKSEG1)) + gpa = CPHYSADDR(gva); + else { + printk("%s: cannot find GPA for GVA: %#lx\n", __func__, gva); + kvm_mips_dump_host_tlbs(); + gpa = KVM_INVALID_ADDR; + } + +#ifdef DEBUG + kvm_debug("%s: gva %#lx, gpa: %#llx\n", __func__, gva, gpa); +#endif + + return gpa; +} + + +static int kvm_trap_emul_handle_cop_unusable(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (((cause & CAUSEF_CE) >> CAUSEB_CE) == 1) { + er = kvm_mips_emulate_fpu_exc(cause, opc, run, vcpu); + } else + er = kvm_mips_emulate_inst(cause, opc, run, vcpu); + + switch (er) { + case EMULATE_DONE: + ret = RESUME_GUEST; + break; + + case EMULATE_FAIL: + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + break; + + case EMULATE_WAIT: + run->exit_reason = KVM_EXIT_INTR; + ret = RESUME_HOST; + break; + + default: + BUG(); + } + return ret; +} + +static int kvm_trap_emul_handle_tlb_mod(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (KVM_GUEST_KSEGX(badvaddr) < KVM_GUEST_KSEG0 + || KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG23) { +#ifdef DEBUG + kvm_debug + ("USER/KSEG23 ADDR TLB MOD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); +#endif + er = kvm_mips_handle_tlbmod(cause, opc, run, vcpu); + + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else if (KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG0) { + /* XXXKYMA: The guest kernel does not expect to get this fault when we are not + * using HIGHMEM. Need to address this in a HIGHMEM kernel + */ + printk + ("TLB MOD fault not handled, cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + kvm_mips_dump_host_tlbs(); + kvm_arch_vcpu_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } else { + printk + ("Illegal TLB Mod fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + kvm_mips_dump_host_tlbs(); + kvm_arch_vcpu_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_tlb_st_miss(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (((badvaddr & PAGE_MASK) == KVM_GUEST_COMMPAGE_ADDR) + && KVM_GUEST_KERNEL_MODE(vcpu)) { + if (kvm_mips_handle_commpage_tlb_fault(badvaddr, vcpu) < 0) { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else if (KVM_GUEST_KSEGX(badvaddr) < KVM_GUEST_KSEG0 + || KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG23) { +#ifdef DEBUG + kvm_debug + ("USER ADDR TLB LD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); +#endif + er = kvm_mips_handle_tlbmiss(cause, opc, run, vcpu); + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else if (KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG0) { + /* All KSEG0 faults are handled by KVM, as the guest kernel does not + * expect to ever get them + */ + if (kvm_mips_handle_kseg0_tlb_fault + (vcpu->arch.host_cp0_badvaddr, vcpu) < 0) { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else { + kvm_err + ("Illegal TLB LD fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + kvm_mips_dump_host_tlbs(); + kvm_arch_vcpu_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_tlb_ld_miss(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (((badvaddr & PAGE_MASK) == KVM_GUEST_COMMPAGE_ADDR) + && KVM_GUEST_KERNEL_MODE(vcpu)) { + if (kvm_mips_handle_commpage_tlb_fault(badvaddr, vcpu) < 0) { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else if (KVM_GUEST_KSEGX(badvaddr) < KVM_GUEST_KSEG0 + || KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG23) { +#ifdef DEBUG + kvm_debug("USER ADDR TLB ST fault: PC: %#lx, BadVaddr: %#lx\n", + vcpu->arch.pc, badvaddr); +#endif + + /* User Address (UA) fault, this could happen if + * (1) TLB entry not present/valid in both Guest and shadow host TLBs, in this + * case we pass on the fault to the guest kernel and let it handle it. + * (2) TLB entry is present in the Guest TLB but not in the shadow, in this + * case we inject the TLB from the Guest TLB into the shadow host TLB + */ + + er = kvm_mips_handle_tlbmiss(cause, opc, run, vcpu); + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else if (KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG0) { + if (kvm_mips_handle_kseg0_tlb_fault + (vcpu->arch.host_cp0_badvaddr, vcpu) < 0) { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + } else { + printk + ("Illegal TLB ST fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + kvm_mips_dump_host_tlbs(); + kvm_arch_vcpu_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_addr_err_st(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (KVM_GUEST_KERNEL_MODE(vcpu) + && (KSEGX(badvaddr) == CKSEG0 || KSEGX(badvaddr) == CKSEG1)) { +#ifdef DEBUG + kvm_debug("Emulate Store to MMIO space\n"); +#endif + er = kvm_mips_emulate_inst(cause, opc, run, vcpu); + if (er == EMULATE_FAIL) { + printk("Emulate Store to MMIO space failed\n"); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } else { + run->exit_reason = KVM_EXIT_MMIO; + ret = RESUME_HOST; + } + } else { + printk + ("Address Error (STORE): cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_addr_err_ld(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long badvaddr = vcpu->arch.host_cp0_badvaddr; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + if (KSEGX(badvaddr) == CKSEG0 || KSEGX(badvaddr) == CKSEG1) { +#ifdef DEBUG + kvm_debug("Emulate Load from MMIO space @ %#lx\n", badvaddr); +#endif + er = kvm_mips_emulate_inst(cause, opc, run, vcpu); + if (er == EMULATE_FAIL) { + printk("Emulate Load from MMIO space failed\n"); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } else { + run->exit_reason = KVM_EXIT_MMIO; + ret = RESUME_HOST; + } + } else { + printk + ("Address Error (LOAD): cause %#lx, PC: %p, BadVaddr: %#lx\n", + cause, opc, badvaddr); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + er = EMULATE_FAIL; + } + return ret; +} + +static int kvm_trap_emul_handle_syscall(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + er = kvm_mips_emulate_syscall(cause, opc, run, vcpu); + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_res_inst(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + er = kvm_mips_handle_ri(cause, opc, run, vcpu); + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int kvm_trap_emul_handle_break(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + uint32_t __user *opc = (uint32_t __user *) vcpu->arch.pc; + unsigned long cause = vcpu->arch.host_cp0_cause; + enum emulation_result er = EMULATE_DONE; + int ret = RESUME_GUEST; + + er = kvm_mips_emulate_bp_exc(cause, opc, run, vcpu); + if (er == EMULATE_DONE) + ret = RESUME_GUEST; + else { + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + ret = RESUME_HOST; + } + return ret; +} + +static int +kvm_trap_emul_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + + kvm_write_c0_guest_index(cop0, regs->cp0reg[MIPS_CP0_TLB_INDEX][0]); + kvm_write_c0_guest_context(cop0, regs->cp0reg[MIPS_CP0_TLB_CONTEXT][0]); + kvm_write_c0_guest_badvaddr(cop0, regs->cp0reg[MIPS_CP0_BAD_VADDR][0]); + kvm_write_c0_guest_entryhi(cop0, regs->cp0reg[MIPS_CP0_TLB_HI][0]); + kvm_write_c0_guest_epc(cop0, regs->cp0reg[MIPS_CP0_EXC_PC][0]); + + kvm_write_c0_guest_status(cop0, regs->cp0reg[MIPS_CP0_STATUS][0]); + kvm_write_c0_guest_cause(cop0, regs->cp0reg[MIPS_CP0_CAUSE][0]); + kvm_write_c0_guest_pagemask(cop0, + regs->cp0reg[MIPS_CP0_TLB_PG_MASK][0]); + kvm_write_c0_guest_wired(cop0, regs->cp0reg[MIPS_CP0_TLB_WIRED][0]); + kvm_write_c0_guest_errorepc(cop0, regs->cp0reg[MIPS_CP0_ERROR_PC][0]); + + return 0; +} + +static int +kvm_trap_emul_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + + regs->cp0reg[MIPS_CP0_TLB_INDEX][0] = kvm_read_c0_guest_index(cop0); + regs->cp0reg[MIPS_CP0_TLB_CONTEXT][0] = kvm_read_c0_guest_context(cop0); + regs->cp0reg[MIPS_CP0_BAD_VADDR][0] = kvm_read_c0_guest_badvaddr(cop0); + regs->cp0reg[MIPS_CP0_TLB_HI][0] = kvm_read_c0_guest_entryhi(cop0); + regs->cp0reg[MIPS_CP0_EXC_PC][0] = kvm_read_c0_guest_epc(cop0); + + regs->cp0reg[MIPS_CP0_STATUS][0] = kvm_read_c0_guest_status(cop0); + regs->cp0reg[MIPS_CP0_CAUSE][0] = kvm_read_c0_guest_cause(cop0); + regs->cp0reg[MIPS_CP0_TLB_PG_MASK][0] = + kvm_read_c0_guest_pagemask(cop0); + regs->cp0reg[MIPS_CP0_TLB_WIRED][0] = kvm_read_c0_guest_wired(cop0); + regs->cp0reg[MIPS_CP0_ERROR_PC][0] = kvm_read_c0_guest_errorepc(cop0); + + regs->cp0reg[MIPS_CP0_CONFIG][0] = kvm_read_c0_guest_config(cop0); + regs->cp0reg[MIPS_CP0_CONFIG][1] = kvm_read_c0_guest_config1(cop0); + regs->cp0reg[MIPS_CP0_CONFIG][2] = kvm_read_c0_guest_config2(cop0); + regs->cp0reg[MIPS_CP0_CONFIG][3] = kvm_read_c0_guest_config3(cop0); + regs->cp0reg[MIPS_CP0_CONFIG][7] = kvm_read_c0_guest_config7(cop0); + + return 0; +} + +static int kvm_trap_emul_vm_init(struct kvm *kvm) +{ + return 0; +} + +static int kvm_trap_emul_vcpu_init(struct kvm_vcpu *vcpu) +{ + return 0; +} + +static int kvm_trap_emul_vcpu_setup(struct kvm_vcpu *vcpu) +{ + struct mips_coproc *cop0 = vcpu->arch.cop0; + uint32_t config1; + int vcpu_id = vcpu->vcpu_id; + + /* Arch specific stuff, set up config registers properly so that the + * guest will come up as expected, for now we simulate a + * MIPS 24kc + */ + kvm_write_c0_guest_prid(cop0, 0x00019300); + kvm_write_c0_guest_config(cop0, + MIPS_CONFIG0 | (0x1 << CP0C0_AR) | + (MMU_TYPE_R4000 << CP0C0_MT)); + + /* Read the cache characteristics from the host Config1 Register */ + config1 = (read_c0_config1() & ~0x7f); + + /* Set up MMU size */ + config1 &= ~(0x3f << 25); + config1 |= ((KVM_MIPS_GUEST_TLB_SIZE - 1) << 25); + + /* We unset some bits that we aren't emulating */ + config1 &= + ~((1 << CP0C1_C2) | (1 << CP0C1_MD) | (1 << CP0C1_PC) | + (1 << CP0C1_WR) | (1 << CP0C1_CA)); + kvm_write_c0_guest_config1(cop0, config1); + + kvm_write_c0_guest_config2(cop0, MIPS_CONFIG2); + /* MIPS_CONFIG2 | (read_c0_config2() & 0xfff) */ + kvm_write_c0_guest_config3(cop0, + MIPS_CONFIG3 | (0 << CP0C3_VInt) | (1 << + CP0C3_ULRI)); + + /* Set Wait IE/IXMT Ignore in Config7, IAR, AR */ + kvm_write_c0_guest_config7(cop0, (MIPS_CONF7_WII) | (1 << 10)); + + /* Setup IntCtl defaults, compatibilty mode for timer interrupts (HW5) */ + kvm_write_c0_guest_intctl(cop0, 0xFC000000); + + /* Put in vcpu id as CPUNum into Ebase Reg to handle SMP Guests */ + kvm_write_c0_guest_ebase(cop0, KVM_GUEST_KSEG0 | (vcpu_id & 0xFF)); + + return 0; +} + +static struct kvm_mips_callbacks kvm_trap_emul_callbacks = { + /* exit handlers */ + .handle_cop_unusable = kvm_trap_emul_handle_cop_unusable, + .handle_tlb_mod = kvm_trap_emul_handle_tlb_mod, + .handle_tlb_st_miss = kvm_trap_emul_handle_tlb_st_miss, + .handle_tlb_ld_miss = kvm_trap_emul_handle_tlb_ld_miss, + .handle_addr_err_st = kvm_trap_emul_handle_addr_err_st, + .handle_addr_err_ld = kvm_trap_emul_handle_addr_err_ld, + .handle_syscall = kvm_trap_emul_handle_syscall, + .handle_res_inst = kvm_trap_emul_handle_res_inst, + .handle_break = kvm_trap_emul_handle_break, + + .vm_init = kvm_trap_emul_vm_init, + .vcpu_init = kvm_trap_emul_vcpu_init, + .vcpu_setup = kvm_trap_emul_vcpu_setup, + .gva_to_gpa = kvm_trap_emul_gva_to_gpa_cb, + .queue_timer_int = kvm_mips_queue_timer_int_cb, + .dequeue_timer_int = kvm_mips_dequeue_timer_int_cb, + .queue_io_int = kvm_mips_queue_io_int_cb, + .dequeue_io_int = kvm_mips_dequeue_io_int_cb, + .irq_deliver = kvm_mips_irq_deliver_cb, + .irq_clear = kvm_mips_irq_clear_cb, + .vcpu_ioctl_get_regs = kvm_trap_emul_ioctl_get_regs, + .vcpu_ioctl_set_regs = kvm_trap_emul_ioctl_set_regs, +}; + +int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks) +{ + *install_callbacks = &kvm_trap_emul_callbacks; + return 0; +} diff --git a/arch/mips/kvm/trace.h b/arch/mips/kvm/trace.h new file mode 100644 index 000000000000..bc9e0f406c08 --- /dev/null +++ b/arch/mips/kvm/trace.h @@ -0,0 +1,46 @@ +/* +* This file is subject to the terms and conditions of the GNU General Public +* License. See the file "COPYING" in the main directory of this archive +* for more details. +* +* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. +* Authors: Sanjay Lal +*/ + +#if !defined(_TRACE_KVM_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_KVM_H + +#include + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM kvm +#define TRACE_INCLUDE_PATH . +#define TRACE_INCLUDE_FILE trace + +/* + * Tracepoints for VM eists + */ +extern char *kvm_mips_exit_types_str[MAX_KVM_MIPS_EXIT_TYPES]; + +TRACE_EVENT(kvm_exit, + TP_PROTO(struct kvm_vcpu *vcpu, unsigned int reason), + TP_ARGS(vcpu, reason), + TP_STRUCT__entry( + __field(struct kvm_vcpu *, vcpu) + __field(unsigned int, reason) + ), + + TP_fast_assign( + __entry->vcpu = vcpu; + __entry->reason = reason; + ), + + TP_printk("[%s]PC: 0x%08lx", + kvm_mips_exit_types_str[__entry->reason], + __entry->vcpu->arch.pc) +); + +#endif /* _TRACE_KVM_H */ + +/* This part must be outside protection */ +#include diff --git a/arch/mips/lib/bitops.c b/arch/mips/lib/bitops.c index a64daee740ee..3b2a1e78a543 100644 --- a/arch/mips/lib/bitops.c +++ b/arch/mips/lib/bitops.c @@ -19,7 +19,7 @@ */ void __mips_set_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -41,7 +41,7 @@ EXPORT_SYMBOL(__mips_set_bit); */ void __mips_clear_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -63,7 +63,7 @@ EXPORT_SYMBOL(__mips_clear_bit); */ void __mips_change_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -86,7 +86,7 @@ EXPORT_SYMBOL(__mips_change_bit); int __mips_test_and_set_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__mips_test_and_set_bit); int __mips_test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -137,7 +137,7 @@ EXPORT_SYMBOL(__mips_test_and_set_bit_lock); */ int __mips_test_and_clear_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; @@ -162,7 +162,7 @@ EXPORT_SYMBOL(__mips_test_and_clear_bit); */ int __mips_test_and_change_bit(unsigned long nr, volatile unsigned long *addr) { - volatile unsigned long *a = addr; + unsigned long *a = (unsigned long *)addr; unsigned bit = nr & SZLONG_MASK; unsigned long mask; unsigned long flags; diff --git a/arch/mips/lib/dump_tlb.c b/arch/mips/lib/dump_tlb.c index 32b9f21bfd85..8a12d00908e0 100644 --- a/arch/mips/lib/dump_tlb.c +++ b/arch/mips/lib/dump_tlb.c @@ -11,6 +11,7 @@ #include #include #include +#include static inline const char *msk2str(unsigned int mask) { @@ -55,7 +56,7 @@ static void dump_tlb(int first, int last) s_pagemask = read_c0_pagemask(); s_entryhi = read_c0_entryhi(); s_index = read_c0_index(); - asid = s_entryhi & 0xff; + asid = ASID_MASK(s_entryhi); for (i = first; i <= last; i++) { write_c0_index(i); @@ -85,7 +86,7 @@ static void dump_tlb(int first, int last) printk("va=%0*lx asid=%02lx\n", width, (entryhi & ~0x1fffUL), - entryhi & 0xff); + ASID_MASK(entryhi)); printk("\t[pa=%0*llx c=%d d=%d v=%d g=%d] ", width, (entrylo0 << 6) & PAGE_MASK, c0, diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 053d3b0b0317..0580194e7402 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -5,7 +5,8 @@ * * Copyright (C) 1998, 1999, 2000 by Ralf Baechle * Copyright (C) 1999, 2000 Silicon Graphics, Inc. - * Copyright (C) 2007 Maciej W. Rozycki + * Copyright (C) 2007 by Maciej W. Rozycki + * Copyright (C) 2011, 2012 MIPS Technologies, Inc. */ #include #include @@ -19,6 +20,20 @@ #define LONG_S_R sdr #endif +#ifdef CONFIG_CPU_MICROMIPS +#define STORSIZE (LONGSIZE * 2) +#define STORMASK (STORSIZE - 1) +#define FILL64RG t8 +#define FILLPTRG t7 +#undef LONG_S +#define LONG_S LONG_SP +#else +#define STORSIZE LONGSIZE +#define STORMASK LONGMASK +#define FILL64RG a1 +#define FILLPTRG t0 +#endif + #define EX(insn,reg,addr,handler) \ 9: insn reg, addr; \ .section __ex_table,"a"; \ @@ -26,23 +41,25 @@ .previous .macro f_fill64 dst, offset, val, fixup - EX(LONG_S, \val, (\offset + 0 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 1 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 2 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 3 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 4 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 5 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 6 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 7 * LONGSIZE)(\dst), \fixup) -#if LONGSIZE == 4 - EX(LONG_S, \val, (\offset + 8 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 9 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 10 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 11 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 12 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 13 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 14 * LONGSIZE)(\dst), \fixup) - EX(LONG_S, \val, (\offset + 15 * LONGSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 0 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 1 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 2 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 3 * STORSIZE)(\dst), \fixup) +#if ((defined(CONFIG_CPU_MICROMIPS) && (LONGSIZE == 4)) || !defined(CONFIG_CPU_MICROMIPS)) + EX(LONG_S, \val, (\offset + 4 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 5 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 6 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 7 * STORSIZE)(\dst), \fixup) +#endif +#if (!defined(CONFIG_CPU_MICROMIPS) && (LONGSIZE == 4)) + EX(LONG_S, \val, (\offset + 8 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 9 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 10 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 11 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 12 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 13 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 14 * STORSIZE)(\dst), \fixup) + EX(LONG_S, \val, (\offset + 15 * STORSIZE)(\dst), \fixup) #endif .endm @@ -71,16 +88,20 @@ LEAF(memset) 1: FEXPORT(__bzero) - sltiu t0, a2, LONGSIZE /* very small region? */ + sltiu t0, a2, STORSIZE /* very small region? */ bnez t0, .Lsmall_memset - andi t0, a0, LONGMASK /* aligned? */ + andi t0, a0, STORMASK /* aligned? */ +#ifdef CONFIG_CPU_MICROMIPS + move t8, a1 /* used by 'swp' instruction */ + move t9, a1 +#endif #ifndef CONFIG_CPU_DADDI_WORKAROUNDS beqz t0, 1f - PTR_SUBU t0, LONGSIZE /* alignment in bytes */ + PTR_SUBU t0, STORSIZE /* alignment in bytes */ #else .set noat - li AT, LONGSIZE + li AT, STORSIZE beqz t0, 1f PTR_SUBU t0, AT /* alignment in bytes */ .set at @@ -99,24 +120,27 @@ FEXPORT(__bzero) 1: ori t1, a2, 0x3f /* # of full blocks */ xori t1, 0x3f beqz t1, .Lmemset_partial /* no block to fill */ - andi t0, a2, 0x40-LONGSIZE + andi t0, a2, 0x40-STORSIZE PTR_ADDU t1, a0 /* end address */ .set reorder 1: PTR_ADDIU a0, 64 R10KCBARRIER(0(ra)) - f_fill64 a0, -64, a1, .Lfwd_fixup + f_fill64 a0, -64, FILL64RG, .Lfwd_fixup bne t1, a0, 1b .set noreorder .Lmemset_partial: R10KCBARRIER(0(ra)) PTR_LA t1, 2f /* where to start */ +#ifdef CONFIG_CPU_MICROMIPS + LONG_SRL t7, t0, 1 +#endif #if LONGSIZE == 4 - PTR_SUBU t1, t0 + PTR_SUBU t1, FILLPTRG #else .set noat - LONG_SRL AT, t0, 1 + LONG_SRL AT, FILLPTRG, 1 PTR_SUBU t1, AT .set at #endif @@ -126,9 +150,9 @@ FEXPORT(__bzero) .set push .set noreorder .set nomacro - f_fill64 a0, -64, a1, .Lpartial_fixup /* ... but first do longs ... */ + f_fill64 a0, -64, FILL64RG, .Lpartial_fixup /* ... but first do longs ... */ 2: .set pop - andi a2, LONGMASK /* At most one long to go */ + andi a2, STORMASK /* At most one long to go */ beqz a2, 1f PTR_ADDU a0, a2 /* What's left */ @@ -169,7 +193,7 @@ FEXPORT(__bzero) .Lpartial_fixup: PTR_L t0, TI_TASK($28) - andi a2, LONGMASK + andi a2, STORMASK LONG_L t0, THREAD_BUADDR(t0) LONG_ADDU a2, t1 jr ra @@ -177,4 +201,4 @@ FEXPORT(__bzero) .Llast_fixup: jr ra - andi v1, a2, LONGMASK + andi v1, a2, STORMASK diff --git a/arch/mips/lib/mips-atomic.c b/arch/mips/lib/mips-atomic.c index cd160be3ce4d..6807f7172eaf 100644 --- a/arch/mips/lib/mips-atomic.c +++ b/arch/mips/lib/mips-atomic.c @@ -13,6 +13,7 @@ #include #include #include +#include #if !defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_MIPS_MT_SMTC) @@ -34,8 +35,11 @@ * * Workaround: mask EXL bit of the result or place a nop before mfc0. */ -__asm__( - " .macro arch_local_irq_disable\n" +notrace void arch_local_irq_disable(void) +{ + preempt_disable(); + + __asm__ __volatile__( " .set push \n" " .set noat \n" #ifdef CONFIG_MIPS_MT_SMTC @@ -52,108 +56,98 @@ __asm__( " .set noreorder \n" " mtc0 $1,$12 \n" #endif - " irq_disable_hazard \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); + : /* no outputs */ + : /* no inputs */ + : "memory"); -notrace void arch_local_irq_disable(void) -{ - preempt_disable(); - __asm__ __volatile__( - "arch_local_irq_disable" - : /* no outputs */ - : /* no inputs */ - : "memory"); preempt_enable(); } EXPORT_SYMBOL(arch_local_irq_disable); -__asm__( - " .macro arch_local_irq_save result \n" +notrace unsigned long arch_local_irq_save(void) +{ + unsigned long flags; + + preempt_disable(); + + __asm__ __volatile__( " .set push \n" " .set reorder \n" " .set noat \n" #ifdef CONFIG_MIPS_MT_SMTC - " mfc0 \\result, $2, 1 \n" - " ori $1, \\result, 0x400 \n" + " mfc0 %[flags], $2, 1 \n" + " ori $1, %[flags], 0x400 \n" " .set noreorder \n" " mtc0 $1, $2, 1 \n" - " andi \\result, \\result, 0x400 \n" + " andi %[flags], %[flags], 0x400 \n" #elif defined(CONFIG_CPU_MIPSR2) /* see irqflags.h for inline function */ #else - " mfc0 \\result, $12 \n" - " ori $1, \\result, 0x1f \n" + " mfc0 %[flags], $12 \n" + " ori $1, %[flags], 0x1f \n" " xori $1, 0x1f \n" " .set noreorder \n" " mtc0 $1, $12 \n" #endif - " irq_disable_hazard \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); + : [flags] "=r" (flags) + : /* no inputs */ + : "memory"); -notrace unsigned long arch_local_irq_save(void) -{ - unsigned long flags; - preempt_disable(); - asm volatile("arch_local_irq_save\t%0" - : "=r" (flags) - : /* no inputs */ - : "memory"); preempt_enable(); + return flags; } EXPORT_SYMBOL(arch_local_irq_save); +notrace void arch_local_irq_restore(unsigned long flags) +{ + unsigned long __tmp1; + +#ifdef CONFIG_MIPS_MT_SMTC + /* + * SMTC kernel needs to do a software replay of queued + * IPIs, at the cost of branch and call overhead on each + * local_irq_restore() + */ + if (unlikely(!(flags & 0x0400))) + smtc_ipi_replay(); +#endif + preempt_disable(); -__asm__( - " .macro arch_local_irq_restore flags \n" + __asm__ __volatile__( " .set push \n" " .set noreorder \n" " .set noat \n" #ifdef CONFIG_MIPS_MT_SMTC - "mfc0 $1, $2, 1 \n" - "andi \\flags, 0x400 \n" - "ori $1, 0x400 \n" - "xori $1, 0x400 \n" - "or \\flags, $1 \n" - "mtc0 \\flags, $2, 1 \n" + " mfc0 $1, $2, 1 \n" + " andi %[flags], 0x400 \n" + " ori $1, 0x400 \n" + " xori $1, 0x400 \n" + " or %[flags], $1 \n" + " mtc0 %[flags], $2, 1 \n" #elif defined(CONFIG_CPU_MIPSR2) && defined(CONFIG_IRQ_CPU) /* see irqflags.h for inline function */ #elif defined(CONFIG_CPU_MIPSR2) /* see irqflags.h for inline function */ #else " mfc0 $1, $12 \n" - " andi \\flags, 1 \n" + " andi %[flags], 1 \n" " ori $1, 0x1f \n" " xori $1, 0x1f \n" - " or \\flags, $1 \n" - " mtc0 \\flags, $12 \n" + " or %[flags], $1 \n" + " mtc0 %[flags], $12 \n" #endif - " irq_disable_hazard \n" + " " __stringify(__irq_disable_hazard) " \n" " .set pop \n" - " .endm \n"); + : [flags] "=r" (__tmp1) + : "0" (flags) + : "memory"); -notrace void arch_local_irq_restore(unsigned long flags) -{ - unsigned long __tmp1; - -#ifdef CONFIG_MIPS_MT_SMTC - /* - * SMTC kernel needs to do a software replay of queued - * IPIs, at the cost of branch and call overhead on each - * local_irq_restore() - */ - if (unlikely(!(flags & 0x0400))) - smtc_ipi_replay(); -#endif - preempt_disable(); - __asm__ __volatile__( - "arch_local_irq_restore\t%0" - : "=r" (__tmp1) - : "0" (flags) - : "memory"); preempt_enable(); } EXPORT_SYMBOL(arch_local_irq_restore); @@ -164,11 +158,36 @@ notrace void __arch_local_irq_restore(unsigned long flags) unsigned long __tmp1; preempt_disable(); + __asm__ __volatile__( - "arch_local_irq_restore\t%0" - : "=r" (__tmp1) - : "0" (flags) - : "memory"); + " .set push \n" + " .set noreorder \n" + " .set noat \n" +#ifdef CONFIG_MIPS_MT_SMTC + " mfc0 $1, $2, 1 \n" + " andi %[flags], 0x400 \n" + " ori $1, 0x400 \n" + " xori $1, 0x400 \n" + " or %[flags], $1 \n" + " mtc0 %[flags], $2, 1 \n" +#elif defined(CONFIG_CPU_MIPSR2) && defined(CONFIG_IRQ_CPU) + /* see irqflags.h for inline function */ +#elif defined(CONFIG_CPU_MIPSR2) + /* see irqflags.h for inline function */ +#else + " mfc0 $1, $12 \n" + " andi %[flags], 1 \n" + " ori $1, 0x1f \n" + " xori $1, 0x1f \n" + " or %[flags], $1 \n" + " mtc0 %[flags], $12 \n" +#endif + " " __stringify(__irq_disable_hazard) " \n" + " .set pop \n" + : [flags] "=r" (__tmp1) + : "0" (flags) + : "memory"); + preempt_enable(); } EXPORT_SYMBOL(__arch_local_irq_restore); diff --git a/arch/mips/lib/r3k_dump_tlb.c b/arch/mips/lib/r3k_dump_tlb.c index 91615c2ef0cf..8327698b9937 100644 --- a/arch/mips/lib/r3k_dump_tlb.c +++ b/arch/mips/lib/r3k_dump_tlb.c @@ -9,6 +9,7 @@ #include #include +#include #include #include #include @@ -21,7 +22,7 @@ static void dump_tlb(int first, int last) unsigned int asid; unsigned long entryhi, entrylo0; - asid = read_c0_entryhi() & 0xfc0; + asid = ASID_MASK(read_c0_entryhi()); for (i = first; i <= last; i++) { write_c0_index(i<<8); @@ -35,7 +36,7 @@ static void dump_tlb(int first, int last) /* Unused entries have a virtual address of KSEG0. */ if ((entryhi & 0xffffe000) != 0x80000000 - && (entryhi & 0xfc0) == asid) { + && (ASID_MASK(entryhi) == asid)) { /* * Only print entries in use */ @@ -44,7 +45,7 @@ static void dump_tlb(int first, int last) printk("va=%08lx asid=%08lx" " [pa=%06lx n=%d d=%d v=%d g=%d]", (entryhi & 0xffffe000), - entryhi & 0xfc0, + ASID_MASK(entryhi), entrylo0 & PAGE_MASK, (entrylo0 & (1 << 11)) ? 1 : 0, (entrylo0 & (1 << 10)) ? 1 : 0, diff --git a/arch/mips/lib/strlen_user.S b/arch/mips/lib/strlen_user.S index fdbb970f670d..e362dcdc69d1 100644 --- a/arch/mips/lib/strlen_user.S +++ b/arch/mips/lib/strlen_user.S @@ -3,8 +3,9 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (c) 1996, 1998, 1999, 2004 by Ralf Baechle - * Copyright (c) 1999 Silicon Graphics, Inc. + * Copyright (C) 1996, 1998, 1999, 2004 by Ralf Baechle + * Copyright (C) 1999 Silicon Graphics, Inc. + * Copyright (C) 2011 MIPS Technologies, Inc. */ #include #include @@ -28,9 +29,9 @@ LEAF(__strlen_user_asm) FEXPORT(__strlen_user_nocheck_asm) move v0, a0 -1: EX(lb, t0, (v0), .Lfault) +1: EX(lbu, v1, (v0), .Lfault) PTR_ADDIU v0, 1 - bnez t0, 1b + bnez v1, 1b PTR_SUBU v0, a0 jr ra END(__strlen_user_asm) diff --git a/arch/mips/lib/strncpy_user.S b/arch/mips/lib/strncpy_user.S index bad539487503..92870b6b53ea 100644 --- a/arch/mips/lib/strncpy_user.S +++ b/arch/mips/lib/strncpy_user.S @@ -3,7 +3,8 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (c) 1996, 1999 by Ralf Baechle + * Copyright (C) 1996, 1999 by Ralf Baechle + * Copyright (C) 2011 MIPS Technologies, Inc. */ #include #include @@ -33,26 +34,27 @@ LEAF(__strncpy_from_user_asm) bnez v0, .Lfault FEXPORT(__strncpy_from_user_nocheck_asm) - move v0, zero - move v1, a1 .set noreorder -1: EX(lbu, t0, (v1), .Lfault) + move t0, zero + move v1, a1 +1: EX(lbu, v0, (v1), .Lfault) PTR_ADDIU v1, 1 R10KCBARRIER(0(ra)) - beqz t0, 2f - sb t0, (a0) - PTR_ADDIU v0, 1 - .set reorder - PTR_ADDIU a0, 1 - bne v0, a2, 1b -2: PTR_ADDU t0, a1, v0 - xor t0, a1 - bltz t0, .Lfault + beqz v0, 2f + sb v0, (a0) + PTR_ADDIU t0, 1 + bne t0, a2, 1b + PTR_ADDIU a0, 1 +2: PTR_ADDU v0, a1, t0 + xor v0, a1 + bltz v0, .Lfault + nop jr ra # return n + move v0, t0 END(__strncpy_from_user_asm) -.Lfault: li v0, -EFAULT - jr ra +.Lfault: jr ra + li v0, -EFAULT .section __ex_table,"a" PTR 1b, .Lfault diff --git a/arch/mips/lib/strnlen_user.S b/arch/mips/lib/strnlen_user.S index beea03c8c0ce..fcacea5e61f1 100644 --- a/arch/mips/lib/strnlen_user.S +++ b/arch/mips/lib/strnlen_user.S @@ -35,7 +35,7 @@ FEXPORT(__strnlen_user_nocheck_asm) PTR_ADDU a1, a0 # stop pointer 1: beq v0, a1, 1f # limit reached? EX(lb, t0, (v0), .Lfault) - PTR_ADDU v0, 1 + PTR_ADDIU v0, 1 bnez t0, 1b 1: PTR_SUBU v0, a0 jr ra diff --git a/arch/mips/loongson/common/Makefile b/arch/mips/loongson/common/Makefile index e526488df655..4c57b3e5743f 100644 --- a/arch/mips/loongson/common/Makefile +++ b/arch/mips/loongson/common/Makefile @@ -4,7 +4,7 @@ obj-y += setup.o init.o cmdline.o env.o time.o reset.o irq.o \ pci.o bonito-irq.o mem.o machtype.o platform.o -obj-$(CONFIG_GENERIC_GPIO) += gpio.o +obj-$(CONFIG_GPIOLIB) += gpio.o # # Serial port support diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c index afb5a0bcf7a5..f03771900813 100644 --- a/arch/mips/math-emu/cp1emu.c +++ b/arch/mips/math-emu/cp1emu.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include @@ -81,6 +82,11 @@ DEFINE_PER_CPU(struct mips_fpu_emulator_stats, fpuemustats); /* Determine rounding mode from the RM bits of the FCSR */ #define modeindex(v) ((v) & FPU_CSR_RM) +/* microMIPS bitfields */ +#define MM_POOL32A_MINOR_MASK 0x3f +#define MM_POOL32A_MINOR_SHIFT 0x6 +#define MM_MIPS32_COND_FC 0x30 + /* Convert Mips rounding mode (0..3) to IEEE library modes. */ static const unsigned char ieee_rm[4] = { [FPU_CSR_RN] = IEEE754_RN, @@ -110,6 +116,556 @@ static const unsigned int fpucondbit[8] = { }; #endif +/* (microMIPS) Convert 16-bit register encoding to 32-bit register encoding. */ +static const unsigned int reg16to32map[8] = {16, 17, 2, 3, 4, 5, 6, 7}; + +/* (microMIPS) Convert certain microMIPS instructions to MIPS32 format. */ +static const int sd_format[] = {16, 17, 0, 0, 0, 0, 0, 0}; +static const int sdps_format[] = {16, 17, 22, 0, 0, 0, 0, 0}; +static const int dwl_format[] = {17, 20, 21, 0, 0, 0, 0, 0}; +static const int swl_format[] = {16, 20, 21, 0, 0, 0, 0, 0}; + +/* + * This functions translates a 32-bit microMIPS instruction + * into a 32-bit MIPS32 instruction. Returns 0 on success + * and SIGILL otherwise. + */ +static int microMIPS32_to_MIPS32(union mips_instruction *insn_ptr) +{ + union mips_instruction insn = *insn_ptr; + union mips_instruction mips32_insn = insn; + int func, fmt, op; + + switch (insn.mm_i_format.opcode) { + case mm_ldc132_op: + mips32_insn.mm_i_format.opcode = ldc1_op; + mips32_insn.mm_i_format.rt = insn.mm_i_format.rs; + mips32_insn.mm_i_format.rs = insn.mm_i_format.rt; + break; + case mm_lwc132_op: + mips32_insn.mm_i_format.opcode = lwc1_op; + mips32_insn.mm_i_format.rt = insn.mm_i_format.rs; + mips32_insn.mm_i_format.rs = insn.mm_i_format.rt; + break; + case mm_sdc132_op: + mips32_insn.mm_i_format.opcode = sdc1_op; + mips32_insn.mm_i_format.rt = insn.mm_i_format.rs; + mips32_insn.mm_i_format.rs = insn.mm_i_format.rt; + break; + case mm_swc132_op: + mips32_insn.mm_i_format.opcode = swc1_op; + mips32_insn.mm_i_format.rt = insn.mm_i_format.rs; + mips32_insn.mm_i_format.rs = insn.mm_i_format.rt; + break; + case mm_pool32i_op: + /* NOTE: offset is << by 1 if in microMIPS mode. */ + if ((insn.mm_i_format.rt == mm_bc1f_op) || + (insn.mm_i_format.rt == mm_bc1t_op)) { + mips32_insn.fb_format.opcode = cop1_op; + mips32_insn.fb_format.bc = bc_op; + mips32_insn.fb_format.flag = + (insn.mm_i_format.rt == mm_bc1t_op) ? 1 : 0; + } else + return SIGILL; + break; + case mm_pool32f_op: + switch (insn.mm_fp0_format.func) { + case mm_32f_01_op: + case mm_32f_11_op: + case mm_32f_02_op: + case mm_32f_12_op: + case mm_32f_41_op: + case mm_32f_51_op: + case mm_32f_42_op: + case mm_32f_52_op: + op = insn.mm_fp0_format.func; + if (op == mm_32f_01_op) + func = madd_s_op; + else if (op == mm_32f_11_op) + func = madd_d_op; + else if (op == mm_32f_02_op) + func = nmadd_s_op; + else if (op == mm_32f_12_op) + func = nmadd_d_op; + else if (op == mm_32f_41_op) + func = msub_s_op; + else if (op == mm_32f_51_op) + func = msub_d_op; + else if (op == mm_32f_42_op) + func = nmsub_s_op; + else + func = nmsub_d_op; + mips32_insn.fp6_format.opcode = cop1x_op; + mips32_insn.fp6_format.fr = insn.mm_fp6_format.fr; + mips32_insn.fp6_format.ft = insn.mm_fp6_format.ft; + mips32_insn.fp6_format.fs = insn.mm_fp6_format.fs; + mips32_insn.fp6_format.fd = insn.mm_fp6_format.fd; + mips32_insn.fp6_format.func = func; + break; + case mm_32f_10_op: + func = -1; /* Invalid */ + op = insn.mm_fp5_format.op & 0x7; + if (op == mm_ldxc1_op) + func = ldxc1_op; + else if (op == mm_sdxc1_op) + func = sdxc1_op; + else if (op == mm_lwxc1_op) + func = lwxc1_op; + else if (op == mm_swxc1_op) + func = swxc1_op; + + if (func != -1) { + mips32_insn.r_format.opcode = cop1x_op; + mips32_insn.r_format.rs = + insn.mm_fp5_format.base; + mips32_insn.r_format.rt = + insn.mm_fp5_format.index; + mips32_insn.r_format.rd = 0; + mips32_insn.r_format.re = insn.mm_fp5_format.fd; + mips32_insn.r_format.func = func; + } else + return SIGILL; + break; + case mm_32f_40_op: + op = -1; /* Invalid */ + if (insn.mm_fp2_format.op == mm_fmovt_op) + op = 1; + else if (insn.mm_fp2_format.op == mm_fmovf_op) + op = 0; + if (op != -1) { + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp2_format.fmt]; + mips32_insn.fp0_format.ft = + (insn.mm_fp2_format.cc<<2) + op; + mips32_insn.fp0_format.fs = + insn.mm_fp2_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp2_format.fd; + mips32_insn.fp0_format.func = fmovc_op; + } else + return SIGILL; + break; + case mm_32f_60_op: + func = -1; /* Invalid */ + if (insn.mm_fp0_format.op == mm_fadd_op) + func = fadd_op; + else if (insn.mm_fp0_format.op == mm_fsub_op) + func = fsub_op; + else if (insn.mm_fp0_format.op == mm_fmul_op) + func = fmul_op; + else if (insn.mm_fp0_format.op == mm_fdiv_op) + func = fdiv_op; + if (func != -1) { + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp0_format.fmt]; + mips32_insn.fp0_format.ft = + insn.mm_fp0_format.ft; + mips32_insn.fp0_format.fs = + insn.mm_fp0_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp0_format.fd; + mips32_insn.fp0_format.func = func; + } else + return SIGILL; + break; + case mm_32f_70_op: + func = -1; /* Invalid */ + if (insn.mm_fp0_format.op == mm_fmovn_op) + func = fmovn_op; + else if (insn.mm_fp0_format.op == mm_fmovz_op) + func = fmovz_op; + if (func != -1) { + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp0_format.fmt]; + mips32_insn.fp0_format.ft = + insn.mm_fp0_format.ft; + mips32_insn.fp0_format.fs = + insn.mm_fp0_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp0_format.fd; + mips32_insn.fp0_format.func = func; + } else + return SIGILL; + break; + case mm_32f_73_op: /* POOL32FXF */ + switch (insn.mm_fp1_format.op) { + case mm_movf0_op: + case mm_movf1_op: + case mm_movt0_op: + case mm_movt1_op: + if ((insn.mm_fp1_format.op & 0x7f) == + mm_movf0_op) + op = 0; + else + op = 1; + mips32_insn.r_format.opcode = spec_op; + mips32_insn.r_format.rs = insn.mm_fp4_format.fs; + mips32_insn.r_format.rt = + (insn.mm_fp4_format.cc << 2) + op; + mips32_insn.r_format.rd = insn.mm_fp4_format.rt; + mips32_insn.r_format.re = 0; + mips32_insn.r_format.func = movc_op; + break; + case mm_fcvtd0_op: + case mm_fcvtd1_op: + case mm_fcvts0_op: + case mm_fcvts1_op: + if ((insn.mm_fp1_format.op & 0x7f) == + mm_fcvtd0_op) { + func = fcvtd_op; + fmt = swl_format[insn.mm_fp3_format.fmt]; + } else { + func = fcvts_op; + fmt = dwl_format[insn.mm_fp3_format.fmt]; + } + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = fmt; + mips32_insn.fp0_format.ft = 0; + mips32_insn.fp0_format.fs = + insn.mm_fp3_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp3_format.rt; + mips32_insn.fp0_format.func = func; + break; + case mm_fmov0_op: + case mm_fmov1_op: + case mm_fabs0_op: + case mm_fabs1_op: + case mm_fneg0_op: + case mm_fneg1_op: + if ((insn.mm_fp1_format.op & 0x7f) == + mm_fmov0_op) + func = fmov_op; + else if ((insn.mm_fp1_format.op & 0x7f) == + mm_fabs0_op) + func = fabs_op; + else + func = fneg_op; + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp3_format.fmt]; + mips32_insn.fp0_format.ft = 0; + mips32_insn.fp0_format.fs = + insn.mm_fp3_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp3_format.rt; + mips32_insn.fp0_format.func = func; + break; + case mm_ffloorl_op: + case mm_ffloorw_op: + case mm_fceill_op: + case mm_fceilw_op: + case mm_ftruncl_op: + case mm_ftruncw_op: + case mm_froundl_op: + case mm_froundw_op: + case mm_fcvtl_op: + case mm_fcvtw_op: + if (insn.mm_fp1_format.op == mm_ffloorl_op) + func = ffloorl_op; + else if (insn.mm_fp1_format.op == mm_ffloorw_op) + func = ffloor_op; + else if (insn.mm_fp1_format.op == mm_fceill_op) + func = fceill_op; + else if (insn.mm_fp1_format.op == mm_fceilw_op) + func = fceil_op; + else if (insn.mm_fp1_format.op == mm_ftruncl_op) + func = ftruncl_op; + else if (insn.mm_fp1_format.op == mm_ftruncw_op) + func = ftrunc_op; + else if (insn.mm_fp1_format.op == mm_froundl_op) + func = froundl_op; + else if (insn.mm_fp1_format.op == mm_froundw_op) + func = fround_op; + else if (insn.mm_fp1_format.op == mm_fcvtl_op) + func = fcvtl_op; + else + func = fcvtw_op; + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sd_format[insn.mm_fp1_format.fmt]; + mips32_insn.fp0_format.ft = 0; + mips32_insn.fp0_format.fs = + insn.mm_fp1_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp1_format.rt; + mips32_insn.fp0_format.func = func; + break; + case mm_frsqrt_op: + case mm_fsqrt_op: + case mm_frecip_op: + if (insn.mm_fp1_format.op == mm_frsqrt_op) + func = frsqrt_op; + else if (insn.mm_fp1_format.op == mm_fsqrt_op) + func = fsqrt_op; + else + func = frecip_op; + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp1_format.fmt]; + mips32_insn.fp0_format.ft = 0; + mips32_insn.fp0_format.fs = + insn.mm_fp1_format.fs; + mips32_insn.fp0_format.fd = + insn.mm_fp1_format.rt; + mips32_insn.fp0_format.func = func; + break; + case mm_mfc1_op: + case mm_mtc1_op: + case mm_cfc1_op: + case mm_ctc1_op: + if (insn.mm_fp1_format.op == mm_mfc1_op) + op = mfc_op; + else if (insn.mm_fp1_format.op == mm_mtc1_op) + op = mtc_op; + else if (insn.mm_fp1_format.op == mm_cfc1_op) + op = cfc_op; + else + op = ctc_op; + mips32_insn.fp1_format.opcode = cop1_op; + mips32_insn.fp1_format.op = op; + mips32_insn.fp1_format.rt = + insn.mm_fp1_format.rt; + mips32_insn.fp1_format.fs = + insn.mm_fp1_format.fs; + mips32_insn.fp1_format.fd = 0; + mips32_insn.fp1_format.func = 0; + break; + default: + return SIGILL; + break; + } + break; + case mm_32f_74_op: /* c.cond.fmt */ + mips32_insn.fp0_format.opcode = cop1_op; + mips32_insn.fp0_format.fmt = + sdps_format[insn.mm_fp4_format.fmt]; + mips32_insn.fp0_format.ft = insn.mm_fp4_format.rt; + mips32_insn.fp0_format.fs = insn.mm_fp4_format.fs; + mips32_insn.fp0_format.fd = insn.mm_fp4_format.cc << 2; + mips32_insn.fp0_format.func = + insn.mm_fp4_format.cond | MM_MIPS32_COND_FC; + break; + default: + return SIGILL; + break; + } + break; + default: + return SIGILL; + break; + } + + *insn_ptr = mips32_insn; + return 0; +} + +int mm_isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn, + unsigned long *contpc) +{ + union mips_instruction insn = (union mips_instruction)dec_insn.insn; + int bc_false = 0; + unsigned int fcr31; + unsigned int bit; + + switch (insn.mm_i_format.opcode) { + case mm_pool32a_op: + if ((insn.mm_i_format.simmediate & MM_POOL32A_MINOR_MASK) == + mm_pool32axf_op) { + switch (insn.mm_i_format.simmediate >> + MM_POOL32A_MINOR_SHIFT) { + case mm_jalr_op: + case mm_jalrhb_op: + case mm_jalrs_op: + case mm_jalrshb_op: + if (insn.mm_i_format.rt != 0) /* Not mm_jr */ + regs->regs[insn.mm_i_format.rt] = + regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + *contpc = regs->regs[insn.mm_i_format.rs]; + return 1; + break; + } + } + break; + case mm_pool32i_op: + switch (insn.mm_i_format.rt) { + case mm_bltzals_op: + case mm_bltzal_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ + case mm_bltz_op: + if ((long)regs->regs[insn.mm_i_format.rs] < 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case mm_bgezals_op: + case mm_bgezal_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ + case mm_bgez_op: + if ((long)regs->regs[insn.mm_i_format.rs] >= 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case mm_blez_op: + if ((long)regs->regs[insn.mm_i_format.rs] <= 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case mm_bgtz_op: + if ((long)regs->regs[insn.mm_i_format.rs] <= 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case mm_bc2f_op: + case mm_bc1f_op: + bc_false = 1; + /* Fall through */ + case mm_bc2t_op: + case mm_bc1t_op: + preempt_disable(); + if (is_fpu_owner()) + asm volatile("cfc1\t%0,$31" : "=r" (fcr31)); + else + fcr31 = current->thread.fpu.fcr31; + preempt_enable(); + + if (bc_false) + fcr31 = ~fcr31; + + bit = (insn.mm_i_format.rs >> 2); + bit += (bit != 0); + bit += 23; + if (fcr31 & (1 << bit)) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + return 1; + break; + } + break; + case mm_pool16c_op: + switch (insn.mm_i_format.rt) { + case mm_jalr16_op: + case mm_jalrs16_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + /* Fall through */ + case mm_jr16_op: + *contpc = regs->regs[insn.mm_i_format.rs]; + return 1; + break; + } + break; + case mm_beqz16_op: + if ((long)regs->regs[reg16to32map[insn.mm_b1_format.rs]] == 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_b1_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + return 1; + break; + case mm_bnez16_op: + if ((long)regs->regs[reg16to32map[insn.mm_b1_format.rs]] != 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_b1_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + return 1; + break; + case mm_b16_op: + *contpc = regs->cp0_epc + dec_insn.pc_inc + + (insn.mm_b0_format.simmediate << 1); + return 1; + break; + case mm_beq32_op: + if (regs->regs[insn.mm_i_format.rs] == + regs->regs[insn.mm_i_format.rt]) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case mm_bne32_op: + if (regs->regs[insn.mm_i_format.rs] != + regs->regs[insn.mm_i_format.rt]) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.mm_i_format.simmediate << 1); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + return 1; + break; + case mm_jalx32_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + *contpc = regs->cp0_epc + dec_insn.pc_inc; + *contpc >>= 28; + *contpc <<= 28; + *contpc |= (insn.j_format.target << 2); + return 1; + break; + case mm_jals32_op: + case mm_jal32_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + dec_insn.next_pc_inc; + /* Fall through */ + case mm_j32_op: + *contpc = regs->cp0_epc + dec_insn.pc_inc; + *contpc >>= 27; + *contpc <<= 27; + *contpc |= (insn.j_format.target << 1); + set_isa16_mode(*contpc); + return 1; + break; + } + return 0; +} /* * Redundant with logic already in kernel/branch.c, @@ -117,53 +673,177 @@ static const unsigned int fpucondbit[8] = { * a single subroutine should be used across both * modules. */ -static int isBranchInstr(mips_instruction * i) +static int isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn, + unsigned long *contpc) { - switch (MIPSInst_OPCODE(*i)) { + union mips_instruction insn = (union mips_instruction)dec_insn.insn; + unsigned int fcr31; + unsigned int bit = 0; + + switch (insn.i_format.opcode) { case spec_op: - switch (MIPSInst_FUNC(*i)) { + switch (insn.r_format.func) { case jalr_op: + regs->regs[insn.r_format.rd] = + regs->cp0_epc + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ case jr_op: + *contpc = regs->regs[insn.r_format.rs]; return 1; + break; } break; - case bcond_op: - switch (MIPSInst_RT(*i)) { + switch (insn.i_format.rt) { + case bltzal_op: + case bltzall_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ case bltz_op: - case bgez_op: case bltzl_op: - case bgezl_op: - case bltzal_op: + if ((long)regs->regs[insn.i_format.rs] < 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; case bgezal_op: - case bltzall_op: case bgezall_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ + case bgez_op: + case bgezl_op: + if ((long)regs->regs[insn.i_format.rs] >= 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; return 1; + break; } break; - - case j_op: - case jal_op: case jalx_op: + set_isa16_mode(bit); + case jal_op: + regs->regs[31] = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + /* Fall through */ + case j_op: + *contpc = regs->cp0_epc + dec_insn.pc_inc; + *contpc >>= 28; + *contpc <<= 28; + *contpc |= (insn.j_format.target << 2); + /* Set microMIPS mode bit: XOR for jalx. */ + *contpc ^= bit; + return 1; + break; case beq_op: - case bne_op: - case blez_op: - case bgtz_op: case beql_op: + if (regs->regs[insn.i_format.rs] == + regs->regs[insn.i_format.rt]) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case bne_op: case bnel_op: + if (regs->regs[insn.i_format.rs] != + regs->regs[insn.i_format.rt]) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case blez_op: case blezl_op: + if ((long)regs->regs[insn.i_format.rs] <= 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case bgtz_op: case bgtzl_op: + if ((long)regs->regs[insn.i_format.rs] > 0) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; return 1; - + break; case cop0_op: case cop1_op: case cop2_op: case cop1x_op: - if (MIPSInst_RS(*i) == bc_op) - return 1; + if (insn.i_format.rs == bc_op) { + preempt_disable(); + if (is_fpu_owner()) + asm volatile("cfc1\t%0,$31" : "=r" (fcr31)); + else + fcr31 = current->thread.fpu.fcr31; + preempt_enable(); + + bit = (insn.i_format.rt >> 2); + bit += (bit != 0); + bit += 23; + switch (insn.i_format.rt & 3) { + case 0: /* bc1f */ + case 2: /* bc1fl */ + if (~fcr31 & (1 << bit)) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + case 1: /* bc1t */ + case 3: /* bc1tl */ + if (fcr31 & (1 << bit)) + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + (insn.i_format.simmediate << 2); + else + *contpc = regs->cp0_epc + + dec_insn.pc_inc + + dec_insn.next_pc_inc; + return 1; + break; + } + } break; } - return 0; } @@ -210,26 +890,23 @@ static inline int cop1_64bit(struct pt_regs *xcp) */ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx, - void *__user *fault_addr) + struct mm_decoded_insn dec_insn, void *__user *fault_addr) { mips_instruction ir; - unsigned long emulpc, contpc; + unsigned long contpc = xcp->cp0_epc + dec_insn.pc_inc; unsigned int cond; - - if (!access_ok(VERIFY_READ, xcp->cp0_epc, sizeof(mips_instruction))) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGBUS; - } - if (__get_user(ir, (mips_instruction __user *) xcp->cp0_epc)) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGSEGV; - } + int pc_inc; /* XXX NEC Vr54xx bug workaround */ - if ((xcp->cp0_cause & CAUSEF_BD) && !isBranchInstr(&ir)) - xcp->cp0_cause &= ~CAUSEF_BD; + if (xcp->cp0_cause & CAUSEF_BD) { + if (dec_insn.micro_mips_mode) { + if (!mm_isBranchInstr(xcp, dec_insn, &contpc)) + xcp->cp0_cause &= ~CAUSEF_BD; + } else { + if (!isBranchInstr(xcp, dec_insn, &contpc)) + xcp->cp0_cause &= ~CAUSEF_BD; + } + } if (xcp->cp0_cause & CAUSEF_BD) { /* @@ -244,32 +921,33 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx, * Linux MIPS branch emulator operates on context, updating the * cp0_epc. */ - emulpc = xcp->cp0_epc + 4; /* Snapshot emulation target */ + ir = dec_insn.next_insn; /* process delay slot instr */ + pc_inc = dec_insn.next_pc_inc; + } else { + ir = dec_insn.insn; /* process current instr */ + pc_inc = dec_insn.pc_inc; + } - if (__compute_return_epc(xcp) < 0) { -#ifdef CP1DBG - printk("failed to emulate branch at %p\n", - (void *) (xcp->cp0_epc)); -#endif + /* + * Since microMIPS FPU instructios are a subset of MIPS32 FPU + * instructions, we want to convert microMIPS FPU instructions + * into MIPS32 instructions so that we could reuse all of the + * FPU emulation code. + * + * NOTE: We cannot do this for branch instructions since they + * are not a subset. Example: Cannot emulate a 16-bit + * aligned target address with a MIPS32 instruction. + */ + if (dec_insn.micro_mips_mode) { + /* + * If next instruction is a 16-bit instruction, then it + * it cannot be a FPU instruction. This could happen + * since we can be called for non-FPU instructions. + */ + if ((pc_inc == 2) || + (microMIPS32_to_MIPS32((union mips_instruction *)&ir) + == SIGILL)) return SIGILL; - } - if (!access_ok(VERIFY_READ, emulpc, sizeof(mips_instruction))) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)emulpc; - return SIGBUS; - } - if (__get_user(ir, (mips_instruction __user *) emulpc)) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)emulpc; - return SIGSEGV; - } - /* __compute_return_epc() will have updated cp0_epc */ - contpc = xcp->cp0_epc; - /* In order not to confuse ptrace() et al, tweak context */ - xcp->cp0_epc = emulpc - 4; - } else { - emulpc = xcp->cp0_epc; - contpc = xcp->cp0_epc + 4; } emul: @@ -474,22 +1152,35 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx, /* branch taken: emulate dslot * instruction */ - xcp->cp0_epc += 4; - contpc = (xcp->cp0_epc + - (MIPSInst_SIMM(ir) << 2)); - - if (!access_ok(VERIFY_READ, xcp->cp0_epc, - sizeof(mips_instruction))) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGBUS; - } - if (__get_user(ir, - (mips_instruction __user *) xcp->cp0_epc)) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGSEGV; - } + xcp->cp0_epc += dec_insn.pc_inc; + + contpc = MIPSInst_SIMM(ir); + ir = dec_insn.next_insn; + if (dec_insn.micro_mips_mode) { + contpc = (xcp->cp0_epc + (contpc << 1)); + + /* If 16-bit instruction, not FPU. */ + if ((dec_insn.next_pc_inc == 2) || + (microMIPS32_to_MIPS32((union mips_instruction *)&ir) == SIGILL)) { + + /* + * Since this instruction will + * be put on the stack with + * 32-bit words, get around + * this problem by putting a + * NOP16 as the second one. + */ + if (dec_insn.next_pc_inc == 2) + ir = (ir & (~0xffff)) | MM_NOP16; + + /* + * Single step the non-CP1 + * instruction in the dslot. + */ + return mips_dsemul(xcp, ir, contpc); + } + } else + contpc = (xcp->cp0_epc + (contpc << 2)); switch (MIPSInst_OPCODE(ir)) { case lwc1_op: @@ -525,8 +1216,8 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx, * branch likely nullifies * dslot if not taken */ - xcp->cp0_epc += 4; - contpc += 4; + xcp->cp0_epc += dec_insn.pc_inc; + contpc += dec_insn.pc_inc; /* * else continue & execute * dslot as normal insn @@ -1313,25 +2004,75 @@ int fpu_emulator_cop1Handler(struct pt_regs *xcp, struct mips_fpu_struct *ctx, int has_fpu, void *__user *fault_addr) { unsigned long oldepc, prevepc; - mips_instruction insn; + struct mm_decoded_insn dec_insn; + u16 instr[4]; + u16 *instr_ptr; int sig = 0; oldepc = xcp->cp0_epc; do { prevepc = xcp->cp0_epc; - if (!access_ok(VERIFY_READ, xcp->cp0_epc, sizeof(mips_instruction))) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGBUS; - } - if (__get_user(insn, (mips_instruction __user *) xcp->cp0_epc)) { - MIPS_FPU_EMU_INC_STATS(errors); - *fault_addr = (mips_instruction __user *)xcp->cp0_epc; - return SIGSEGV; + if (get_isa16_mode(prevepc) && cpu_has_mmips) { + /* + * Get next 2 microMIPS instructions and convert them + * into 32-bit instructions. + */ + if ((get_user(instr[0], (u16 __user *)msk_isa16_mode(xcp->cp0_epc))) || + (get_user(instr[1], (u16 __user *)msk_isa16_mode(xcp->cp0_epc + 2))) || + (get_user(instr[2], (u16 __user *)msk_isa16_mode(xcp->cp0_epc + 4))) || + (get_user(instr[3], (u16 __user *)msk_isa16_mode(xcp->cp0_epc + 6)))) { + MIPS_FPU_EMU_INC_STATS(errors); + return SIGBUS; + } + instr_ptr = instr; + + /* Get first instruction. */ + if (mm_insn_16bit(*instr_ptr)) { + /* Duplicate the half-word. */ + dec_insn.insn = (*instr_ptr << 16) | + (*instr_ptr); + /* 16-bit instruction. */ + dec_insn.pc_inc = 2; + instr_ptr += 1; + } else { + dec_insn.insn = (*instr_ptr << 16) | + *(instr_ptr+1); + /* 32-bit instruction. */ + dec_insn.pc_inc = 4; + instr_ptr += 2; + } + /* Get second instruction. */ + if (mm_insn_16bit(*instr_ptr)) { + /* Duplicate the half-word. */ + dec_insn.next_insn = (*instr_ptr << 16) | + (*instr_ptr); + /* 16-bit instruction. */ + dec_insn.next_pc_inc = 2; + } else { + dec_insn.next_insn = (*instr_ptr << 16) | + *(instr_ptr+1); + /* 32-bit instruction. */ + dec_insn.next_pc_inc = 4; + } + dec_insn.micro_mips_mode = 1; + } else { + if ((get_user(dec_insn.insn, + (mips_instruction __user *) xcp->cp0_epc)) || + (get_user(dec_insn.next_insn, + (mips_instruction __user *)(xcp->cp0_epc+4)))) { + MIPS_FPU_EMU_INC_STATS(errors); + return SIGBUS; + } + dec_insn.pc_inc = 4; + dec_insn.next_pc_inc = 4; + dec_insn.micro_mips_mode = 0; } - if (insn == 0) - xcp->cp0_epc += 4; /* skip nops */ + + if ((dec_insn.insn == 0) || + ((dec_insn.pc_inc == 2) && + ((dec_insn.insn & 0xffff) == MM_NOP16))) + xcp->cp0_epc += dec_insn.pc_inc; /* Skip NOPs */ else { /* * The 'ieee754_csr' is an alias of @@ -1341,7 +2082,7 @@ int fpu_emulator_cop1Handler(struct pt_regs *xcp, struct mips_fpu_struct *ctx, */ /* convert to ieee library modes */ ieee754_csr.rm = ieee_rm[ieee754_csr.rm]; - sig = cop1Emulate(xcp, ctx, fault_addr); + sig = cop1Emulate(xcp, ctx, dec_insn, fault_addr); /* revert to mips rounding mode */ ieee754_csr.rm = mips_rm[ieee754_csr.rm]; } diff --git a/arch/mips/math-emu/dsemul.c b/arch/mips/math-emu/dsemul.c index 384a3b0091ea..7ea622ab8dad 100644 --- a/arch/mips/math-emu/dsemul.c +++ b/arch/mips/math-emu/dsemul.c @@ -55,7 +55,9 @@ int mips_dsemul(struct pt_regs *regs, mips_instruction ir, unsigned long cpc) struct emuframe __user *fr; int err; - if (ir == 0) { /* a nop is easy */ + if ((get_isa16_mode(regs->cp0_epc) && ((ir >> 16) == MM_NOP16)) || + (ir == 0)) { + /* NOP is easy */ regs->cp0_epc = cpc; regs->cp0_cause &= ~CAUSEF_BD; return 0; @@ -91,8 +93,16 @@ int mips_dsemul(struct pt_regs *regs, mips_instruction ir, unsigned long cpc) if (unlikely(!access_ok(VERIFY_WRITE, fr, sizeof(struct emuframe)))) return SIGBUS; - err = __put_user(ir, &fr->emul); - err |= __put_user((mips_instruction)BREAK_MATH, &fr->badinst); + if (get_isa16_mode(regs->cp0_epc)) { + err = __put_user(ir >> 16, (u16 __user *)(&fr->emul)); + err |= __put_user(ir & 0xffff, (u16 __user *)((long)(&fr->emul) + 2)); + err |= __put_user(BREAK_MATH >> 16, (u16 __user *)(&fr->badinst)); + err |= __put_user(BREAK_MATH & 0xffff, (u16 __user *)((long)(&fr->badinst) + 2)); + } else { + err = __put_user(ir, &fr->emul); + err |= __put_user((mips_instruction)BREAK_MATH, &fr->badinst); + } + err |= __put_user((mips_instruction)BD_COOKIE, &fr->cookie); err |= __put_user(cpc, &fr->epc); @@ -101,7 +111,8 @@ int mips_dsemul(struct pt_regs *regs, mips_instruction ir, unsigned long cpc) return SIGBUS; } - regs->cp0_epc = (unsigned long) &fr->emul; + regs->cp0_epc = ((unsigned long) &fr->emul) | + get_isa16_mode(regs->cp0_epc); flush_cache_sigtramp((unsigned long)&fr->badinst); @@ -114,9 +125,10 @@ int do_dsemulret(struct pt_regs *xcp) unsigned long epc; u32 insn, cookie; int err = 0; + u16 instr[2]; fr = (struct emuframe __user *) - (xcp->cp0_epc - sizeof(mips_instruction)); + (msk_isa16_mode(xcp->cp0_epc) - sizeof(mips_instruction)); /* * If we can't even access the area, something is very wrong, but we'll @@ -131,7 +143,13 @@ int do_dsemulret(struct pt_regs *xcp) * - Is the instruction pointed to by the EPC an BREAK_MATH? * - Is the following memory word the BD_COOKIE? */ - err = __get_user(insn, &fr->badinst); + if (get_isa16_mode(xcp->cp0_epc)) { + err = __get_user(instr[0], (u16 __user *)(&fr->badinst)); + err |= __get_user(instr[1], (u16 __user *)((long)(&fr->badinst) + 2)); + insn = (instr[0] << 16) | instr[1]; + } else { + err = __get_user(insn, &fr->badinst); + } err |= __get_user(cookie, &fr->cookie); if (unlikely(err || (insn != BREAK_MATH) || (cookie != BD_COOKIE))) { diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile index 1dcec30ad1c4..e87aae1f2e80 100644 --- a/arch/mips/mm/Makefile +++ b/arch/mips/mm/Makefile @@ -4,7 +4,7 @@ obj-y += cache.o dma-default.o extable.o fault.o \ gup.o init.o mmap.o page.o page-funcs.o \ - tlbex.o tlbex-fault.o uasm.o + tlbex.o tlbex-fault.o uasm-mips.o obj-$(CONFIG_32BIT) += ioremap.o pgtable-32.o obj-$(CONFIG_64BIT) += pgtable-64.o @@ -22,3 +22,5 @@ obj-$(CONFIG_IP22_CPU_SCACHE) += sc-ip22.o obj-$(CONFIG_R5000_CPU_SCACHE) += sc-r5k.o obj-$(CONFIG_RM7000_CPU_SCACHE) += sc-rm7k.o obj-$(CONFIG_MIPS_CPU_SCACHE) += sc-mips.o + +obj-$(CONFIG_SYS_SUPPORTS_MICROMIPS) += uasm-micromips.o diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c index 2078915eacb9..21813beec7a5 100644 --- a/arch/mips/mm/c-r4k.c +++ b/arch/mips/mm/c-r4k.c @@ -33,6 +33,7 @@ #include #include /* for run_uncached() */ #include +#include /* * Special Variant of smp_call_function for use by cache functions: @@ -136,7 +137,8 @@ static void __cpuinit r4k_blast_dcache_page_indexed_setup(void) r4k_blast_dcache_page_indexed = blast_dcache64_page_indexed; } -static void (* r4k_blast_dcache)(void); +void (* r4k_blast_dcache)(void); +EXPORT_SYMBOL(r4k_blast_dcache); static void __cpuinit r4k_blast_dcache_setup(void) { @@ -264,7 +266,8 @@ static void __cpuinit r4k_blast_icache_page_indexed_setup(void) r4k_blast_icache_page_indexed = blast_icache64_page_indexed; } -static void (* r4k_blast_icache)(void); +void (* r4k_blast_icache)(void); +EXPORT_SYMBOL(r4k_blast_icache); static void __cpuinit r4k_blast_icache_setup(void) { @@ -1377,20 +1380,6 @@ static void __cpuinit coherency_setup(void) } } -#if defined(CONFIG_DMA_NONCOHERENT) - -static int __cpuinitdata coherentio; - -static int __init setcoherentio(char *str) -{ - coherentio = 1; - - return 0; -} - -early_param("coherentio", setcoherentio); -#endif - static void __cpuinit r4k_cache_error_setup(void) { extern char __weak except_vec2_generic; @@ -1472,9 +1461,14 @@ void __cpuinit r4k_cache_init(void) build_clear_page(); build_copy_page(); -#if !defined(CONFIG_MIPS_CMP) + + /* + * We want to run CMP kernels on core with and without coherent + * caches. Therefore, do not use CONFIG_MIPS_CMP to decide whether + * or not to flush caches. + */ local_r4k___flush_cache_all(NULL); -#endif + coherency_setup(); board_cache_error_setup = r4k_cache_error_setup; } diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c index 07cec4407b0c..5aeb3eb0b72f 100644 --- a/arch/mips/mm/cache.c +++ b/arch/mips/mm/cache.c @@ -48,6 +48,7 @@ void (*flush_icache_all)(void); EXPORT_SYMBOL_GPL(local_flush_data_cache_page); EXPORT_SYMBOL(flush_data_cache_page); +EXPORT_SYMBOL(flush_icache_all); #ifdef CONFIG_DMA_NONCOHERENT diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c index f9ef83829a52..caf92ecb37d6 100644 --- a/arch/mips/mm/dma-default.c +++ b/arch/mips/mm/dma-default.c @@ -22,6 +22,26 @@ #include +int coherentio = 0; /* User defined DMA coherency from command line. */ +EXPORT_SYMBOL_GPL(coherentio); +int hw_coherentio = 0; /* Actual hardware supported DMA coherency setting. */ + +static int __init setcoherentio(char *str) +{ + coherentio = 1; + pr_info("Hardware DMA cache coherency (command line)\n"); + return 0; +} +early_param("coherentio", setcoherentio); + +static int __init setnocoherentio(char *str) +{ + coherentio = 0; + pr_info("Software DMA cache coherency (command line)\n"); + return 0; +} +early_param("nocoherentio", setnocoherentio); + static inline struct page *dma_addr_to_page(struct device *dev, dma_addr_t dma_addr) { @@ -115,7 +135,8 @@ static void *mips_dma_alloc_coherent(struct device *dev, size_t size, if (!plat_device_is_coherent(dev)) { dma_cache_wback_inv((unsigned long) ret, size); - ret = UNCAC_ADDR(ret); + if (!hw_coherentio) + ret = UNCAC_ADDR(ret); } } @@ -142,7 +163,7 @@ static void mips_dma_free_coherent(struct device *dev, size_t size, void *vaddr, plat_unmap_dma_mem(dev, dma_handle, size, DMA_BIDIRECTIONAL); - if (!plat_device_is_coherent(dev)) + if (!plat_device_is_coherent(dev) && !hw_coherentio) addr = CAC_ADDR(addr); free_pages(addr, get_order(size)); diff --git a/arch/mips/mm/page.c b/arch/mips/mm/page.c index a29fba55b53e..4eb8dcfaf1ce 100644 --- a/arch/mips/mm/page.c +++ b/arch/mips/mm/page.c @@ -247,6 +247,11 @@ void __cpuinit build_clear_page(void) struct uasm_label *l = labels; struct uasm_reloc *r = relocs; int i; + static atomic_t run_once = ATOMIC_INIT(0); + + if (atomic_xchg(&run_once, 1)) { + return; + } memset(labels, 0, sizeof(labels)); memset(relocs, 0, sizeof(relocs)); @@ -389,6 +394,11 @@ void __cpuinit build_copy_page(void) struct uasm_label *l = labels; struct uasm_reloc *r = relocs; int i; + static atomic_t run_once = ATOMIC_INIT(0); + + if (atomic_xchg(&run_once, 1)) { + return; + } memset(labels, 0, sizeof(labels)); memset(relocs, 0, sizeof(relocs)); diff --git a/arch/mips/mm/tlb-r3k.c b/arch/mips/mm/tlb-r3k.c index a63d1ed0827f..4a13c150f31b 100644 --- a/arch/mips/mm/tlb-r3k.c +++ b/arch/mips/mm/tlb-r3k.c @@ -51,7 +51,7 @@ void local_flush_tlb_all(void) #endif local_irq_save(flags); - old_ctx = read_c0_entryhi() & ASID_MASK; + old_ctx = ASID_MASK(read_c0_entryhi()); write_c0_entrylo0(0); entry = r3k_have_wired_reg ? read_c0_wired() : 8; for (; entry < current_cpu_data.tlbsize; entry++) { @@ -87,13 +87,13 @@ void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, #ifdef DEBUG_TLB printk("[tlbrange<%lu,0x%08lx,0x%08lx>]", - cpu_context(cpu, mm) & ASID_MASK, start, end); + ASID_MASK(cpu_context(cpu, mm)), start, end); #endif local_irq_save(flags); size = (end - start + (PAGE_SIZE - 1)) >> PAGE_SHIFT; if (size <= current_cpu_data.tlbsize) { - int oldpid = read_c0_entryhi() & ASID_MASK; - int newpid = cpu_context(cpu, mm) & ASID_MASK; + int oldpid = ASID_MASK(read_c0_entryhi()); + int newpid = ASID_MASK(cpu_context(cpu, mm)); start &= PAGE_MASK; end += PAGE_SIZE - 1; @@ -166,10 +166,10 @@ void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page) #ifdef DEBUG_TLB printk("[tlbpage<%lu,0x%08lx>]", cpu_context(cpu, vma->vm_mm), page); #endif - newpid = cpu_context(cpu, vma->vm_mm) & ASID_MASK; + newpid = ASID_MASK(cpu_context(cpu, vma->vm_mm)); page &= PAGE_MASK; local_irq_save(flags); - oldpid = read_c0_entryhi() & ASID_MASK; + oldpid = ASID_MASK(read_c0_entryhi()); write_c0_entryhi(page | newpid); BARRIER; tlb_probe(); @@ -197,10 +197,10 @@ void __update_tlb(struct vm_area_struct *vma, unsigned long address, pte_t pte) if (current->active_mm != vma->vm_mm) return; - pid = read_c0_entryhi() & ASID_MASK; + pid = ASID_MASK(read_c0_entryhi()); #ifdef DEBUG_TLB - if ((pid != (cpu_context(cpu, vma->vm_mm) & ASID_MASK)) || (cpu_context(cpu, vma->vm_mm) == 0)) { + if ((pid != ASID_MASK(cpu_context(cpu, vma->vm_mm))) || (cpu_context(cpu, vma->vm_mm) == 0)) { printk("update_mmu_cache: Wheee, bogus tlbpid mmpid=%lu tlbpid=%d\n", (cpu_context(cpu, vma->vm_mm)), pid); } @@ -241,7 +241,7 @@ void add_wired_entry(unsigned long entrylo0, unsigned long entrylo1, local_irq_save(flags); /* Save old context and create impossible VPN2 value */ - old_ctx = read_c0_entryhi() & ASID_MASK; + old_ctx = ASID_MASK(read_c0_entryhi()); old_pagemask = read_c0_pagemask(); w = read_c0_wired(); write_c0_wired(w + 1); @@ -264,7 +264,7 @@ void add_wired_entry(unsigned long entrylo0, unsigned long entrylo1, #endif local_irq_save(flags); - old_ctx = read_c0_entryhi() & ASID_MASK; + old_ctx = ASID_MASK(read_c0_entryhi()); write_c0_entrylo0(entrylo0); write_c0_entryhi(entryhi); write_c0_index(wired); diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c index 493131c81a29..09653b290d53 100644 --- a/arch/mips/mm/tlb-r4k.c +++ b/arch/mips/mm/tlb-r4k.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -94,6 +95,7 @@ void local_flush_tlb_all(void) FLUSH_ITLB; EXIT_CRITICAL(flags); } +EXPORT_SYMBOL(local_flush_tlb_all); /* All entries common to a mm share an asid. To effectively flush these entries, we just bump the asid. */ @@ -285,7 +287,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte) ENTER_CRITICAL(flags); - pid = read_c0_entryhi() & ASID_MASK; + pid = ASID_MASK(read_c0_entryhi()); address &= (PAGE_MASK << 1); write_c0_entryhi(address | pid); pgdp = pgd_offset(vma->vm_mm, address); diff --git a/arch/mips/mm/tlb-r8k.c b/arch/mips/mm/tlb-r8k.c index 91c2499f806a..122f9207f49e 100644 --- a/arch/mips/mm/tlb-r8k.c +++ b/arch/mips/mm/tlb-r8k.c @@ -195,7 +195,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte) if (current->active_mm != vma->vm_mm) return; - pid = read_c0_entryhi() & ASID_MASK; + pid = ASID_MASK(read_c0_entryhi()); local_irq_save(flags); address &= PAGE_MASK; diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c index 820e6612d744..4d46d3787576 100644 --- a/arch/mips/mm/tlbex.c +++ b/arch/mips/mm/tlbex.c @@ -29,6 +29,7 @@ #include #include +#include #include #include #include @@ -305,6 +306,78 @@ static struct uasm_reloc relocs[128] __cpuinitdata; static int check_for_high_segbits __cpuinitdata; #endif +static void __cpuinit insn_fixup(unsigned int **start, unsigned int **stop, + unsigned int i_const) +{ + unsigned int **p; + + for (p = start; p < stop; p++) { +#ifndef CONFIG_CPU_MICROMIPS + unsigned int *ip; + + ip = *p; + *ip = (*ip & 0xffff0000) | i_const; +#else + unsigned short *ip; + + ip = ((unsigned short *)((unsigned int)*p - 1)); + if ((*ip & 0xf000) == 0x4000) { + *ip &= 0xfff1; + *ip |= (i_const << 1); + } else if ((*ip & 0xf000) == 0x6000) { + *ip &= 0xfff1; + *ip |= ((i_const >> 2) << 1); + } else { + ip++; + *ip = i_const; + } +#endif + local_flush_icache_range((unsigned long)ip, + (unsigned long)ip + sizeof(*ip)); + } +} + +#define asid_insn_fixup(section, const) \ +do { \ + extern unsigned int *__start_ ## section; \ + extern unsigned int *__stop_ ## section; \ + insn_fixup(&__start_ ## section, &__stop_ ## section, const); \ +} while(0) + +/* + * Caller is assumed to flush the caches before the first context switch. + */ +static void __cpuinit setup_asid(unsigned int inc, unsigned int mask, + unsigned int version_mask, + unsigned int first_version) +{ + extern asmlinkage void handle_ri_rdhwr_vivt(void); + unsigned long *vivt_exc; + +#ifdef CONFIG_CPU_MICROMIPS + /* + * Worst case optimised microMIPS addiu instructions support + * only a 3-bit immediate value. + */ + if(inc > 7) + panic("Invalid ASID increment value!"); +#endif + asid_insn_fixup(__asid_inc, inc); + asid_insn_fixup(__asid_mask, mask); + asid_insn_fixup(__asid_version_mask, version_mask); + asid_insn_fixup(__asid_first_version, first_version); + + /* Patch up the 'handle_ri_rdhwr_vivt' handler. */ + vivt_exc = (unsigned long *) &handle_ri_rdhwr_vivt; +#ifdef CONFIG_CPU_MICROMIPS + vivt_exc = (unsigned long *)((unsigned long) vivt_exc - 1); +#endif + vivt_exc++; + *vivt_exc = (*vivt_exc & ~mask) | mask; + + current_cpu_data.asid_cache = first_version; +} + static int check_for_high_segbits __cpuinitdata; static unsigned int kscratch_used_mask __cpuinitdata; @@ -1458,17 +1531,17 @@ u32 handle_tlbl[FASTPATH_SIZE] __cacheline_aligned; u32 handle_tlbs[FASTPATH_SIZE] __cacheline_aligned; u32 handle_tlbm[FASTPATH_SIZE] __cacheline_aligned; #ifdef CONFIG_MIPS_PGD_C0_CONTEXT -u32 tlbmiss_handler_setup_pgd[16] __cacheline_aligned; +u32 tlbmiss_handler_setup_pgd_array[16] __cacheline_aligned; static void __cpuinit build_r4000_setup_pgd(void) { const int a0 = 4; const int a1 = 5; - u32 *p = tlbmiss_handler_setup_pgd; + u32 *p = tlbmiss_handler_setup_pgd_array; struct uasm_label *l = labels; struct uasm_reloc *r = relocs; - memset(tlbmiss_handler_setup_pgd, 0, sizeof(tlbmiss_handler_setup_pgd)); + memset(tlbmiss_handler_setup_pgd_array, 0, sizeof(tlbmiss_handler_setup_pgd_array)); memset(labels, 0, sizeof(labels)); memset(relocs, 0, sizeof(relocs)); @@ -1496,15 +1569,15 @@ static void __cpuinit build_r4000_setup_pgd(void) uasm_i_jr(&p, 31); UASM_i_MTC0(&p, a0, 31, pgd_reg); } - if (p - tlbmiss_handler_setup_pgd > ARRAY_SIZE(tlbmiss_handler_setup_pgd)) - panic("tlbmiss_handler_setup_pgd space exceeded"); + if (p - tlbmiss_handler_setup_pgd_array > ARRAY_SIZE(tlbmiss_handler_setup_pgd_array)) + panic("tlbmiss_handler_setup_pgd_array space exceeded"); uasm_resolve_relocs(relocs, labels); - pr_debug("Wrote tlbmiss_handler_setup_pgd (%u instructions).\n", - (unsigned int)(p - tlbmiss_handler_setup_pgd)); + pr_debug("Wrote tlbmiss_handler_setup_pgd_array (%u instructions).\n", + (unsigned int)(p - tlbmiss_handler_setup_pgd_array)); dump_handler("tlbmiss_handler", - tlbmiss_handler_setup_pgd, - ARRAY_SIZE(tlbmiss_handler_setup_pgd)); + tlbmiss_handler_setup_pgd_array, + ARRAY_SIZE(tlbmiss_handler_setup_pgd_array)); } #endif @@ -2030,6 +2103,13 @@ static void __cpuinit build_r4000_tlb_load_handler(void) uasm_l_nopage_tlbl(&l, p); build_restore_work_registers(&p); +#ifdef CONFIG_CPU_MICROMIPS + if ((unsigned long)tlb_do_page_fault_0 & 1) { + uasm_i_lui(&p, K0, uasm_rel_hi((long)tlb_do_page_fault_0)); + uasm_i_addiu(&p, K0, K0, uasm_rel_lo((long)tlb_do_page_fault_0)); + uasm_i_jr(&p, K0); + } else +#endif uasm_i_j(&p, (unsigned long)tlb_do_page_fault_0 & 0x0fffffff); uasm_i_nop(&p); @@ -2077,6 +2157,13 @@ static void __cpuinit build_r4000_tlb_store_handler(void) uasm_l_nopage_tlbs(&l, p); build_restore_work_registers(&p); +#ifdef CONFIG_CPU_MICROMIPS + if ((unsigned long)tlb_do_page_fault_1 & 1) { + uasm_i_lui(&p, K0, uasm_rel_hi((long)tlb_do_page_fault_1)); + uasm_i_addiu(&p, K0, K0, uasm_rel_lo((long)tlb_do_page_fault_1)); + uasm_i_jr(&p, K0); + } else +#endif uasm_i_j(&p, (unsigned long)tlb_do_page_fault_1 & 0x0fffffff); uasm_i_nop(&p); @@ -2125,6 +2212,13 @@ static void __cpuinit build_r4000_tlb_modify_handler(void) uasm_l_nopage_tlbm(&l, p); build_restore_work_registers(&p); +#ifdef CONFIG_CPU_MICROMIPS + if ((unsigned long)tlb_do_page_fault_1 & 1) { + uasm_i_lui(&p, K0, uasm_rel_hi((long)tlb_do_page_fault_1)); + uasm_i_addiu(&p, K0, K0, uasm_rel_lo((long)tlb_do_page_fault_1)); + uasm_i_jr(&p, K0); + } else +#endif uasm_i_j(&p, (unsigned long)tlb_do_page_fault_1 & 0x0fffffff); uasm_i_nop(&p); @@ -2162,8 +2256,12 @@ void __cpuinit build_tlb_refill_handler(void) case CPU_TX3922: case CPU_TX3927: #ifndef CONFIG_MIPS_PGD_C0_CONTEXT - build_r3000_tlb_refill_handler(); + setup_asid(0x40, 0xfc0, 0xf000, ASID_FIRST_VERSION_R3000); + if (cpu_has_local_ebase) + build_r3000_tlb_refill_handler(); if (!run_once) { + if (!cpu_has_local_ebase) + build_r3000_tlb_refill_handler(); build_r3000_tlb_load_handler(); build_r3000_tlb_store_handler(); build_r3000_tlb_modify_handler(); @@ -2184,6 +2282,11 @@ void __cpuinit build_tlb_refill_handler(void) break; default: +#ifndef CONFIG_MIPS_MT_SMTC + setup_asid(0x1, 0xff, 0xff00, ASID_FIRST_VERSION_R4000); +#else + setup_asid(0x1, smtc_asid_mask, 0xff00, ASID_FIRST_VERSION_R4000); +#endif if (!run_once) { scratch_reg = allocate_kscratch(); #ifdef CONFIG_MIPS_PGD_C0_CONTEXT @@ -2192,9 +2295,12 @@ void __cpuinit build_tlb_refill_handler(void) build_r4000_tlb_load_handler(); build_r4000_tlb_store_handler(); build_r4000_tlb_modify_handler(); + if (!cpu_has_local_ebase) + build_r4000_tlb_refill_handler(); run_once++; } - build_r4000_tlb_refill_handler(); + if (cpu_has_local_ebase) + build_r4000_tlb_refill_handler(); } } @@ -2207,7 +2313,7 @@ void __cpuinit flush_tlb_handlers(void) local_flush_icache_range((unsigned long)handle_tlbm, (unsigned long)handle_tlbm + sizeof(handle_tlbm)); #ifdef CONFIG_MIPS_PGD_C0_CONTEXT - local_flush_icache_range((unsigned long)tlbmiss_handler_setup_pgd, - (unsigned long)tlbmiss_handler_setup_pgd + sizeof(handle_tlbm)); + local_flush_icache_range((unsigned long)tlbmiss_handler_setup_pgd_array, + (unsigned long)tlbmiss_handler_setup_pgd_array + sizeof(handle_tlbm)); #endif } diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c new file mode 100644 index 000000000000..162ee6d62788 --- /dev/null +++ b/arch/mips/mm/uasm-micromips.c @@ -0,0 +1,221 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * A small micro-assembler. It is intentionally kept simple, does only + * support a subset of instructions, and does not try to hide pipeline + * effects like branch delay slots. + * + * Copyright (C) 2004, 2005, 2006, 2008 Thiemo Seufer + * Copyright (C) 2005, 2007 Maciej W. Rozycki + * Copyright (C) 2006 Ralf Baechle (ralf@linux-mips.org) + * Copyright (C) 2012, 2013 MIPS Technologies, Inc. All rights reserved. + */ + +#include +#include +#include + +#include +#include +#include +#define UASM_ISA _UASM_ISA_MICROMIPS +#include + +#define RS_MASK 0x1f +#define RS_SH 16 +#define RT_MASK 0x1f +#define RT_SH 21 +#define SCIMM_MASK 0x3ff +#define SCIMM_SH 16 + +/* This macro sets the non-variable bits of an instruction. */ +#define M(a, b, c, d, e, f) \ + ((a) << OP_SH \ + | (b) << RT_SH \ + | (c) << RS_SH \ + | (d) << RD_SH \ + | (e) << RE_SH \ + | (f) << FUNC_SH) + +/* Define these when we are not the ISA the kernel is being compiled with. */ +#ifndef CONFIG_CPU_MICROMIPS +#define MM_uasm_i_b(buf, off) ISAOPC(_beq)(buf, 0, 0, off) +#define MM_uasm_i_beqz(buf, rs, off) ISAOPC(_beq)(buf, rs, 0, off) +#define MM_uasm_i_beqzl(buf, rs, off) ISAOPC(_beql)(buf, rs, 0, off) +#define MM_uasm_i_bnez(buf, rs, off) ISAOPC(_bne)(buf, rs, 0, off) +#endif + +#include "uasm.c" + +static struct insn insn_table_MM[] __uasminitdata = { + { insn_addu, M(mm_pool32a_op, 0, 0, 0, 0, mm_addu32_op), RT | RS | RD }, + { insn_addiu, M(mm_addiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM }, + { insn_and, M(mm_pool32a_op, 0, 0, 0, 0, mm_and_op), RT | RS | RD }, + { insn_andi, M(mm_andi32_op, 0, 0, 0, 0, 0), RT | RS | UIMM }, + { insn_beq, M(mm_beq32_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_beql, 0, 0 }, + { insn_bgez, M(mm_pool32i_op, mm_bgez_op, 0, 0, 0, 0), RS | BIMM }, + { insn_bgezl, 0, 0 }, + { insn_bltz, M(mm_pool32i_op, mm_bltz_op, 0, 0, 0, 0), RS | BIMM }, + { insn_bltzl, 0, 0 }, + { insn_bne, M(mm_bne32_op, 0, 0, 0, 0, 0), RT | RS | BIMM }, + { insn_cache, M(mm_pool32b_op, 0, 0, mm_cache_func, 0, 0), RT | RS | SIMM }, + { insn_daddu, 0, 0 }, + { insn_daddiu, 0, 0 }, + { insn_dmfc0, 0, 0 }, + { insn_dmtc0, 0, 0 }, + { insn_dsll, 0, 0 }, + { insn_dsll32, 0, 0 }, + { insn_dsra, 0, 0 }, + { insn_dsrl, 0, 0 }, + { insn_dsrl32, 0, 0 }, + { insn_drotr, 0, 0 }, + { insn_drotr32, 0, 0 }, + { insn_dsubu, 0, 0 }, + { insn_eret, M(mm_pool32a_op, 0, 0, 0, mm_eret_op, mm_pool32axf_op), 0 }, + { insn_ins, M(mm_pool32a_op, 0, 0, 0, 0, mm_ins_op), RT | RS | RD | RE }, + { insn_ext, M(mm_pool32a_op, 0, 0, 0, 0, mm_ext_op), RT | RS | RD | RE }, + { insn_j, M(mm_j32_op, 0, 0, 0, 0, 0), JIMM }, + { insn_jal, M(mm_jal32_op, 0, 0, 0, 0, 0), JIMM }, + { insn_jr, M(mm_pool32a_op, 0, 0, 0, mm_jalr_op, mm_pool32axf_op), RS }, + { insn_ld, 0, 0 }, + { insn_ll, M(mm_pool32c_op, 0, 0, (mm_ll_func << 1), 0, 0), RS | RT | SIMM }, + { insn_lld, 0, 0 }, + { insn_lui, M(mm_pool32i_op, mm_lui_op, 0, 0, 0, 0), RS | SIMM }, + { insn_lw, M(mm_lw32_op, 0, 0, 0, 0, 0), RT | RS | SIMM }, + { insn_mfc0, M(mm_pool32a_op, 0, 0, 0, mm_mfc0_op, mm_pool32axf_op), RT | RS | RD }, + { insn_mtc0, M(mm_pool32a_op, 0, 0, 0, mm_mtc0_op, mm_pool32axf_op), RT | RS | RD }, + { insn_or, M(mm_pool32a_op, 0, 0, 0, 0, mm_or32_op), RT | RS | RD }, + { insn_ori, M(mm_ori32_op, 0, 0, 0, 0, 0), RT | RS | UIMM }, + { insn_pref, M(mm_pool32c_op, 0, 0, (mm_pref_func << 1), 0, 0), RT | RS | SIMM }, + { insn_rfe, 0, 0 }, + { insn_sc, M(mm_pool32c_op, 0, 0, (mm_sc_func << 1), 0, 0), RT | RS | SIMM }, + { insn_scd, 0, 0 }, + { insn_sd, 0, 0 }, + { insn_sll, M(mm_pool32a_op, 0, 0, 0, 0, mm_sll32_op), RT | RS | RD }, + { insn_sra, M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | RD }, + { insn_srl, M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | RD }, + { insn_rotr, M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | RD }, + { insn_subu, M(mm_pool32a_op, 0, 0, 0, 0, mm_subu32_op), RT | RS | RD }, + { insn_sw, M(mm_sw32_op, 0, 0, 0, 0, 0), RT | RS | SIMM }, + { insn_tlbp, M(mm_pool32a_op, 0, 0, 0, mm_tlbp_op, mm_pool32axf_op), 0 }, + { insn_tlbr, M(mm_pool32a_op, 0, 0, 0, mm_tlbr_op, mm_pool32axf_op), 0 }, + { insn_tlbwi, M(mm_pool32a_op, 0, 0, 0, mm_tlbwi_op, mm_pool32axf_op), 0 }, + { insn_tlbwr, M(mm_pool32a_op, 0, 0, 0, mm_tlbwr_op, mm_pool32axf_op), 0 }, + { insn_xor, M(mm_pool32a_op, 0, 0, 0, 0, mm_xor32_op), RT | RS | RD }, + { insn_xori, M(mm_xori32_op, 0, 0, 0, 0, 0), RT | RS | UIMM }, + { insn_dins, 0, 0 }, + { insn_dinsm, 0, 0 }, + { insn_syscall, M(mm_pool32a_op, 0, 0, 0, mm_syscall_op, mm_pool32axf_op), SCIMM}, + { insn_bbit0, 0, 0 }, + { insn_bbit1, 0, 0 }, + { insn_lwx, 0, 0 }, + { insn_ldx, 0, 0 }, + { insn_invalid, 0, 0 } +}; + +#undef M + +static inline __uasminit u32 build_bimm(s32 arg) +{ + WARN(arg > 0xffff || arg < -0x10000, + KERN_WARNING "Micro-assembler field overflow\n"); + + WARN(arg & 0x3, KERN_WARNING "Invalid micro-assembler branch target\n"); + + return ((arg < 0) ? (1 << 15) : 0) | ((arg >> 1) & 0x7fff); +} + +static inline __uasminit u32 build_jimm(u32 arg) +{ + + WARN(arg & ~((JIMM_MASK << 2) | 1), + KERN_WARNING "Micro-assembler field overflow\n"); + + return (arg >> 1) & JIMM_MASK; +} + +/* + * The order of opcode arguments is implicitly left to right, + * starting with RS and ending with FUNC or IMM. + */ +static void __uasminit build_insn(u32 **buf, enum opcode opc, ...) +{ + struct insn *ip = NULL; + unsigned int i; + va_list ap; + u32 op; + + for (i = 0; insn_table_MM[i].opcode != insn_invalid; i++) + if (insn_table_MM[i].opcode == opc) { + ip = &insn_table_MM[i]; + break; + } + + if (!ip || (opc == insn_daddiu && r4k_daddiu_bug())) + panic("Unsupported Micro-assembler instruction %d", opc); + + op = ip->match; + va_start(ap, opc); + if (ip->fields & RS) { + if (opc == insn_mfc0 || opc == insn_mtc0) + op |= build_rt(va_arg(ap, u32)); + else + op |= build_rs(va_arg(ap, u32)); + } + if (ip->fields & RT) { + if (opc == insn_mfc0 || opc == insn_mtc0) + op |= build_rs(va_arg(ap, u32)); + else + op |= build_rt(va_arg(ap, u32)); + } + if (ip->fields & RD) + op |= build_rd(va_arg(ap, u32)); + if (ip->fields & RE) + op |= build_re(va_arg(ap, u32)); + if (ip->fields & SIMM) + op |= build_simm(va_arg(ap, s32)); + if (ip->fields & UIMM) + op |= build_uimm(va_arg(ap, u32)); + if (ip->fields & BIMM) + op |= build_bimm(va_arg(ap, s32)); + if (ip->fields & JIMM) + op |= build_jimm(va_arg(ap, u32)); + if (ip->fields & FUNC) + op |= build_func(va_arg(ap, u32)); + if (ip->fields & SET) + op |= build_set(va_arg(ap, u32)); + if (ip->fields & SCIMM) + op |= build_scimm(va_arg(ap, u32)); + va_end(ap); + +#ifdef CONFIG_CPU_LITTLE_ENDIAN + **buf = ((op & 0xffff) << 16) | (op >> 16); +#else + **buf = op; +#endif + (*buf)++; +} + +static inline void __uasminit +__resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab) +{ + long laddr = (long)lab->addr; + long raddr = (long)rel->addr; + + switch (rel->type) { + case R_MIPS_PC16: +#ifdef CONFIG_CPU_LITTLE_ENDIAN + *rel->addr |= (build_bimm(laddr - (raddr + 4)) << 16); +#else + *rel->addr |= build_bimm(laddr - (raddr + 4)); +#endif + break; + + default: + panic("Unsupported Micro-assembler relocation %d", + rel->type); + } +} diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c new file mode 100644 index 000000000000..5fcdd8fe3e83 --- /dev/null +++ b/arch/mips/mm/uasm-mips.c @@ -0,0 +1,205 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * A small micro-assembler. It is intentionally kept simple, does only + * support a subset of instructions, and does not try to hide pipeline + * effects like branch delay slots. + * + * Copyright (C) 2004, 2005, 2006, 2008 Thiemo Seufer + * Copyright (C) 2005, 2007 Maciej W. Rozycki + * Copyright (C) 2006 Ralf Baechle (ralf@linux-mips.org) + * Copyright (C) 2012, 2013 MIPS Technologies, Inc. All rights reserved. + */ + +#include +#include +#include + +#include +#include +#include +#define UASM_ISA _UASM_ISA_CLASSIC +#include + +#define RS_MASK 0x1f +#define RS_SH 21 +#define RT_MASK 0x1f +#define RT_SH 16 +#define SCIMM_MASK 0xfffff +#define SCIMM_SH 6 + +/* This macro sets the non-variable bits of an instruction. */ +#define M(a, b, c, d, e, f) \ + ((a) << OP_SH \ + | (b) << RS_SH \ + | (c) << RT_SH \ + | (d) << RD_SH \ + | (e) << RE_SH \ + | (f) << FUNC_SH) + +/* Define these when we are not the ISA the kernel is being compiled with. */ +#ifdef CONFIG_CPU_MICROMIPS +#define CL_uasm_i_b(buf, off) ISAOPC(_beq)(buf, 0, 0, off) +#define CL_uasm_i_beqz(buf, rs, off) ISAOPC(_beq)(buf, rs, 0, off) +#define CL_uasm_i_beqzl(buf, rs, off) ISAOPC(_beql)(buf, rs, 0, off) +#define CL_uasm_i_bnez(buf, rs, off) ISAOPC(_bne)(buf, rs, 0, off) +#endif + +#include "uasm.c" + +static struct insn insn_table[] __uasminitdata = { + { insn_addiu, M(addiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_addu, M(spec_op, 0, 0, 0, 0, addu_op), RS | RT | RD }, + { insn_andi, M(andi_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, + { insn_and, M(spec_op, 0, 0, 0, 0, and_op), RS | RT | RD }, + { insn_bbit0, M(lwc2_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_bbit1, M(swc2_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_beql, M(beql_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_beq, M(beq_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_bgezl, M(bcond_op, 0, bgezl_op, 0, 0, 0), RS | BIMM }, + { insn_bgez, M(bcond_op, 0, bgez_op, 0, 0, 0), RS | BIMM }, + { insn_bltzl, M(bcond_op, 0, bltzl_op, 0, 0, 0), RS | BIMM }, + { insn_bltz, M(bcond_op, 0, bltz_op, 0, 0, 0), RS | BIMM }, + { insn_bne, M(bne_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, + { insn_cache, M(cache_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_daddiu, M(daddiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_daddu, M(spec_op, 0, 0, 0, 0, daddu_op), RS | RT | RD }, + { insn_dinsm, M(spec3_op, 0, 0, 0, 0, dinsm_op), RS | RT | RD | RE }, + { insn_dins, M(spec3_op, 0, 0, 0, 0, dins_op), RS | RT | RD | RE }, + { insn_dmfc0, M(cop0_op, dmfc_op, 0, 0, 0, 0), RT | RD | SET}, + { insn_dmtc0, M(cop0_op, dmtc_op, 0, 0, 0, 0), RT | RD | SET}, + { insn_drotr32, M(spec_op, 1, 0, 0, 0, dsrl32_op), RT | RD | RE }, + { insn_drotr, M(spec_op, 1, 0, 0, 0, dsrl_op), RT | RD | RE }, + { insn_dsll32, M(spec_op, 0, 0, 0, 0, dsll32_op), RT | RD | RE }, + { insn_dsll, M(spec_op, 0, 0, 0, 0, dsll_op), RT | RD | RE }, + { insn_dsra, M(spec_op, 0, 0, 0, 0, dsra_op), RT | RD | RE }, + { insn_dsrl32, M(spec_op, 0, 0, 0, 0, dsrl32_op), RT | RD | RE }, + { insn_dsrl, M(spec_op, 0, 0, 0, 0, dsrl_op), RT | RD | RE }, + { insn_dsubu, M(spec_op, 0, 0, 0, 0, dsubu_op), RS | RT | RD }, + { insn_eret, M(cop0_op, cop_op, 0, 0, 0, eret_op), 0 }, + { insn_ext, M(spec3_op, 0, 0, 0, 0, ext_op), RS | RT | RD | RE }, + { insn_ins, M(spec3_op, 0, 0, 0, 0, ins_op), RS | RT | RD | RE }, + { insn_j, M(j_op, 0, 0, 0, 0, 0), JIMM }, + { insn_jal, M(jal_op, 0, 0, 0, 0, 0), JIMM }, + { insn_j, M(j_op, 0, 0, 0, 0, 0), JIMM }, + { insn_jr, M(spec_op, 0, 0, 0, 0, jr_op), RS }, + { insn_ld, M(ld_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_ldx, M(spec3_op, 0, 0, 0, ldx_op, lx_op), RS | RT | RD }, + { insn_lld, M(lld_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_ll, M(ll_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_lui, M(lui_op, 0, 0, 0, 0, 0), RT | SIMM }, + { insn_lw, M(lw_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_lwx, M(spec3_op, 0, 0, 0, lwx_op, lx_op), RS | RT | RD }, + { insn_mfc0, M(cop0_op, mfc_op, 0, 0, 0, 0), RT | RD | SET}, + { insn_mtc0, M(cop0_op, mtc_op, 0, 0, 0, 0), RT | RD | SET}, + { insn_ori, M(ori_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, + { insn_or, M(spec_op, 0, 0, 0, 0, or_op), RS | RT | RD }, + { insn_pref, M(pref_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_rfe, M(cop0_op, cop_op, 0, 0, 0, rfe_op), 0 }, + { insn_rotr, M(spec_op, 1, 0, 0, 0, srl_op), RT | RD | RE }, + { insn_scd, M(scd_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_sc, M(sc_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_sd, M(sd_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_sll, M(spec_op, 0, 0, 0, 0, sll_op), RT | RD | RE }, + { insn_sra, M(spec_op, 0, 0, 0, 0, sra_op), RT | RD | RE }, + { insn_srl, M(spec_op, 0, 0, 0, 0, srl_op), RT | RD | RE }, + { insn_subu, M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD }, + { insn_sw, M(sw_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, + { insn_syscall, M(spec_op, 0, 0, 0, 0, syscall_op), SCIMM}, + { insn_tlbp, M(cop0_op, cop_op, 0, 0, 0, tlbp_op), 0 }, + { insn_tlbr, M(cop0_op, cop_op, 0, 0, 0, tlbr_op), 0 }, + { insn_tlbwi, M(cop0_op, cop_op, 0, 0, 0, tlbwi_op), 0 }, + { insn_tlbwr, M(cop0_op, cop_op, 0, 0, 0, tlbwr_op), 0 }, + { insn_xori, M(xori_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, + { insn_xor, M(spec_op, 0, 0, 0, 0, xor_op), RS | RT | RD }, + { insn_invalid, 0, 0 } +}; + +#undef M + +static inline __uasminit u32 build_bimm(s32 arg) +{ + WARN(arg > 0x1ffff || arg < -0x20000, + KERN_WARNING "Micro-assembler field overflow\n"); + + WARN(arg & 0x3, KERN_WARNING "Invalid micro-assembler branch target\n"); + + return ((arg < 0) ? (1 << 15) : 0) | ((arg >> 2) & 0x7fff); +} + +static inline __uasminit u32 build_jimm(u32 arg) +{ + WARN(arg & ~(JIMM_MASK << 2), + KERN_WARNING "Micro-assembler field overflow\n"); + + return (arg >> 2) & JIMM_MASK; +} + +/* + * The order of opcode arguments is implicitly left to right, + * starting with RS and ending with FUNC or IMM. + */ +static void __uasminit build_insn(u32 **buf, enum opcode opc, ...) +{ + struct insn *ip = NULL; + unsigned int i; + va_list ap; + u32 op; + + for (i = 0; insn_table[i].opcode != insn_invalid; i++) + if (insn_table[i].opcode == opc) { + ip = &insn_table[i]; + break; + } + + if (!ip || (opc == insn_daddiu && r4k_daddiu_bug())) + panic("Unsupported Micro-assembler instruction %d", opc); + + op = ip->match; + va_start(ap, opc); + if (ip->fields & RS) + op |= build_rs(va_arg(ap, u32)); + if (ip->fields & RT) + op |= build_rt(va_arg(ap, u32)); + if (ip->fields & RD) + op |= build_rd(va_arg(ap, u32)); + if (ip->fields & RE) + op |= build_re(va_arg(ap, u32)); + if (ip->fields & SIMM) + op |= build_simm(va_arg(ap, s32)); + if (ip->fields & UIMM) + op |= build_uimm(va_arg(ap, u32)); + if (ip->fields & BIMM) + op |= build_bimm(va_arg(ap, s32)); + if (ip->fields & JIMM) + op |= build_jimm(va_arg(ap, u32)); + if (ip->fields & FUNC) + op |= build_func(va_arg(ap, u32)); + if (ip->fields & SET) + op |= build_set(va_arg(ap, u32)); + if (ip->fields & SCIMM) + op |= build_scimm(va_arg(ap, u32)); + va_end(ap); + + **buf = op; + (*buf)++; +} + +static inline void __uasminit +__resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab) +{ + long laddr = (long)lab->addr; + long raddr = (long)rel->addr; + + switch (rel->type) { + case R_MIPS_PC16: + *rel->addr |= build_bimm(laddr - (raddr + 4)); + break; + + default: + panic("Unsupported Micro-assembler relocation %d", + rel->type); + } +} diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c index 942ff6c2eba2..7eb5e4355d25 100644 --- a/arch/mips/mm/uasm.c +++ b/arch/mips/mm/uasm.c @@ -10,17 +10,9 @@ * Copyright (C) 2004, 2005, 2006, 2008 Thiemo Seufer * Copyright (C) 2005, 2007 Maciej W. Rozycki * Copyright (C) 2006 Ralf Baechle (ralf@linux-mips.org) + * Copyright (C) 2012, 2013 MIPS Technologies, Inc. All rights reserved. */ -#include -#include -#include - -#include -#include -#include -#include - enum fields { RS = 0x001, RT = 0x002, @@ -37,10 +29,6 @@ enum fields { #define OP_MASK 0x3f #define OP_SH 26 -#define RS_MASK 0x1f -#define RS_SH 21 -#define RT_MASK 0x1f -#define RT_SH 16 #define RD_MASK 0x1f #define RD_SH 11 #define RE_MASK 0x1f @@ -53,8 +41,6 @@ enum fields { #define FUNC_SH 0 #define SET_MASK 0x7 #define SET_SH 0 -#define SCIMM_MASK 0xfffff -#define SCIMM_SH 6 enum opcode { insn_invalid, @@ -77,85 +63,6 @@ struct insn { enum fields fields; }; -/* This macro sets the non-variable bits of an instruction. */ -#define M(a, b, c, d, e, f) \ - ((a) << OP_SH \ - | (b) << RS_SH \ - | (c) << RT_SH \ - | (d) << RD_SH \ - | (e) << RE_SH \ - | (f) << FUNC_SH) - -static struct insn insn_table[] __uasminitdata = { - { insn_addiu, M(addiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_addu, M(spec_op, 0, 0, 0, 0, addu_op), RS | RT | RD }, - { insn_andi, M(andi_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, - { insn_and, M(spec_op, 0, 0, 0, 0, and_op), RS | RT | RD }, - { insn_bbit0, M(lwc2_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, - { insn_bbit1, M(swc2_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, - { insn_beql, M(beql_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, - { insn_beq, M(beq_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, - { insn_bgezl, M(bcond_op, 0, bgezl_op, 0, 0, 0), RS | BIMM }, - { insn_bgez, M(bcond_op, 0, bgez_op, 0, 0, 0), RS | BIMM }, - { insn_bltzl, M(bcond_op, 0, bltzl_op, 0, 0, 0), RS | BIMM }, - { insn_bltz, M(bcond_op, 0, bltz_op, 0, 0, 0), RS | BIMM }, - { insn_bne, M(bne_op, 0, 0, 0, 0, 0), RS | RT | BIMM }, - { insn_cache, M(cache_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_daddiu, M(daddiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_daddu, M(spec_op, 0, 0, 0, 0, daddu_op), RS | RT | RD }, - { insn_dinsm, M(spec3_op, 0, 0, 0, 0, dinsm_op), RS | RT | RD | RE }, - { insn_dins, M(spec3_op, 0, 0, 0, 0, dins_op), RS | RT | RD | RE }, - { insn_dmfc0, M(cop0_op, dmfc_op, 0, 0, 0, 0), RT | RD | SET}, - { insn_dmtc0, M(cop0_op, dmtc_op, 0, 0, 0, 0), RT | RD | SET}, - { insn_drotr32, M(spec_op, 1, 0, 0, 0, dsrl32_op), RT | RD | RE }, - { insn_drotr, M(spec_op, 1, 0, 0, 0, dsrl_op), RT | RD | RE }, - { insn_dsll32, M(spec_op, 0, 0, 0, 0, dsll32_op), RT | RD | RE }, - { insn_dsll, M(spec_op, 0, 0, 0, 0, dsll_op), RT | RD | RE }, - { insn_dsra, M(spec_op, 0, 0, 0, 0, dsra_op), RT | RD | RE }, - { insn_dsrl32, M(spec_op, 0, 0, 0, 0, dsrl32_op), RT | RD | RE }, - { insn_dsrl, M(spec_op, 0, 0, 0, 0, dsrl_op), RT | RD | RE }, - { insn_dsubu, M(spec_op, 0, 0, 0, 0, dsubu_op), RS | RT | RD }, - { insn_eret, M(cop0_op, cop_op, 0, 0, 0, eret_op), 0 }, - { insn_ext, M(spec3_op, 0, 0, 0, 0, ext_op), RS | RT | RD | RE }, - { insn_ins, M(spec3_op, 0, 0, 0, 0, ins_op), RS | RT | RD | RE }, - { insn_j, M(j_op, 0, 0, 0, 0, 0), JIMM }, - { insn_jal, M(jal_op, 0, 0, 0, 0, 0), JIMM }, - { insn_j, M(j_op, 0, 0, 0, 0, 0), JIMM }, - { insn_jr, M(spec_op, 0, 0, 0, 0, jr_op), RS }, - { insn_ld, M(ld_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_ldx, M(spec3_op, 0, 0, 0, ldx_op, lx_op), RS | RT | RD }, - { insn_lld, M(lld_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_ll, M(ll_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_lui, M(lui_op, 0, 0, 0, 0, 0), RT | SIMM }, - { insn_lw, M(lw_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_lwx, M(spec3_op, 0, 0, 0, lwx_op, lx_op), RS | RT | RD }, - { insn_mfc0, M(cop0_op, mfc_op, 0, 0, 0, 0), RT | RD | SET}, - { insn_mtc0, M(cop0_op, mtc_op, 0, 0, 0, 0), RT | RD | SET}, - { insn_ori, M(ori_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, - { insn_or, M(spec_op, 0, 0, 0, 0, or_op), RS | RT | RD }, - { insn_pref, M(pref_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_rfe, M(cop0_op, cop_op, 0, 0, 0, rfe_op), 0 }, - { insn_rotr, M(spec_op, 1, 0, 0, 0, srl_op), RT | RD | RE }, - { insn_scd, M(scd_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_sc, M(sc_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_sd, M(sd_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_sll, M(spec_op, 0, 0, 0, 0, sll_op), RT | RD | RE }, - { insn_sra, M(spec_op, 0, 0, 0, 0, sra_op), RT | RD | RE }, - { insn_srl, M(spec_op, 0, 0, 0, 0, srl_op), RT | RD | RE }, - { insn_subu, M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD }, - { insn_sw, M(sw_op, 0, 0, 0, 0, 0), RS | RT | SIMM }, - { insn_syscall, M(spec_op, 0, 0, 0, 0, syscall_op), SCIMM}, - { insn_tlbp, M(cop0_op, cop_op, 0, 0, 0, tlbp_op), 0 }, - { insn_tlbr, M(cop0_op, cop_op, 0, 0, 0, tlbr_op), 0 }, - { insn_tlbwi, M(cop0_op, cop_op, 0, 0, 0, tlbwi_op), 0 }, - { insn_tlbwr, M(cop0_op, cop_op, 0, 0, 0, tlbwr_op), 0 }, - { insn_xori, M(xori_op, 0, 0, 0, 0, 0), RS | RT | UIMM }, - { insn_xor, M(spec_op, 0, 0, 0, 0, xor_op), RS | RT | RD }, - { insn_invalid, 0, 0 } -}; - -#undef M - static inline __uasminit u32 build_rs(u32 arg) { WARN(arg & ~RS_MASK, KERN_WARNING "Micro-assembler field overflow\n"); @@ -199,24 +106,6 @@ static inline __uasminit u32 build_uimm(u32 arg) return arg & IMM_MASK; } -static inline __uasminit u32 build_bimm(s32 arg) -{ - WARN(arg > 0x1ffff || arg < -0x20000, - KERN_WARNING "Micro-assembler field overflow\n"); - - WARN(arg & 0x3, KERN_WARNING "Invalid micro-assembler branch target\n"); - - return ((arg < 0) ? (1 << 15) : 0) | ((arg >> 2) & 0x7fff); -} - -static inline __uasminit u32 build_jimm(u32 arg) -{ - WARN(arg & ~(JIMM_MASK << 2), - KERN_WARNING "Micro-assembler field overflow\n"); - - return (arg >> 2) & JIMM_MASK; -} - static inline __uasminit u32 build_scimm(u32 arg) { WARN(arg & ~SCIMM_MASK, @@ -239,55 +128,7 @@ static inline __uasminit u32 build_set(u32 arg) return arg & SET_MASK; } -/* - * The order of opcode arguments is implicitly left to right, - * starting with RS and ending with FUNC or IMM. - */ -static void __uasminit build_insn(u32 **buf, enum opcode opc, ...) -{ - struct insn *ip = NULL; - unsigned int i; - va_list ap; - u32 op; - - for (i = 0; insn_table[i].opcode != insn_invalid; i++) - if (insn_table[i].opcode == opc) { - ip = &insn_table[i]; - break; - } - - if (!ip || (opc == insn_daddiu && r4k_daddiu_bug())) - panic("Unsupported Micro-assembler instruction %d", opc); - - op = ip->match; - va_start(ap, opc); - if (ip->fields & RS) - op |= build_rs(va_arg(ap, u32)); - if (ip->fields & RT) - op |= build_rt(va_arg(ap, u32)); - if (ip->fields & RD) - op |= build_rd(va_arg(ap, u32)); - if (ip->fields & RE) - op |= build_re(va_arg(ap, u32)); - if (ip->fields & SIMM) - op |= build_simm(va_arg(ap, s32)); - if (ip->fields & UIMM) - op |= build_uimm(va_arg(ap, u32)); - if (ip->fields & BIMM) - op |= build_bimm(va_arg(ap, s32)); - if (ip->fields & JIMM) - op |= build_jimm(va_arg(ap, u32)); - if (ip->fields & FUNC) - op |= build_func(va_arg(ap, u32)); - if (ip->fields & SET) - op |= build_set(va_arg(ap, u32)); - if (ip->fields & SCIMM) - op |= build_scimm(va_arg(ap, u32)); - va_end(ap); - - **buf = op; - (*buf)++; -} +static void __uasminit build_insn(u32 **buf, enum opcode opc, ...); #define I_u1u2u3(op) \ Ip_u1u2u3(op) \ @@ -445,7 +286,7 @@ I_u3u1u2(_ldx) #ifdef CONFIG_CPU_CAVIUM_OCTEON #include -void __uasminit uasm_i_pref(u32 **buf, unsigned int a, signed int b, +void __uasminit ISAFUNC(uasm_i_pref)(u32 **buf, unsigned int a, signed int b, unsigned int c) { if (OCTEON_IS_MODEL(OCTEON_CN63XX_PASS1_X) && a <= 24 && a != 5) @@ -457,21 +298,21 @@ void __uasminit uasm_i_pref(u32 **buf, unsigned int a, signed int b, else build_insn(buf, insn_pref, c, a, b); } -UASM_EXPORT_SYMBOL(uasm_i_pref); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_i_pref)); #else I_u2s3u1(_pref) #endif /* Handle labels. */ -void __uasminit uasm_build_label(struct uasm_label **lab, u32 *addr, int lid) +void __uasminit ISAFUNC(uasm_build_label)(struct uasm_label **lab, u32 *addr, int lid) { (*lab)->addr = addr; (*lab)->lab = lid; (*lab)++; } -UASM_EXPORT_SYMBOL(uasm_build_label); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_build_label)); -int __uasminit uasm_in_compat_space_p(long addr) +int __uasminit ISAFUNC(uasm_in_compat_space_p)(long addr) { /* Is this address in 32bit compat space? */ #ifdef CONFIG_64BIT @@ -480,7 +321,7 @@ int __uasminit uasm_in_compat_space_p(long addr) return 1; #endif } -UASM_EXPORT_SYMBOL(uasm_in_compat_space_p); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_in_compat_space_p)); static int __uasminit uasm_rel_highest(long val) { @@ -500,77 +341,66 @@ static int __uasminit uasm_rel_higher(long val) #endif } -int __uasminit uasm_rel_hi(long val) +int __uasminit ISAFUNC(uasm_rel_hi)(long val) { return ((((val + 0x8000L) >> 16) & 0xffff) ^ 0x8000) - 0x8000; } -UASM_EXPORT_SYMBOL(uasm_rel_hi); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_rel_hi)); -int __uasminit uasm_rel_lo(long val) +int __uasminit ISAFUNC(uasm_rel_lo)(long val) { return ((val & 0xffff) ^ 0x8000) - 0x8000; } -UASM_EXPORT_SYMBOL(uasm_rel_lo); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_rel_lo)); -void __uasminit UASM_i_LA_mostly(u32 **buf, unsigned int rs, long addr) +void __uasminit ISAFUNC(UASM_i_LA_mostly)(u32 **buf, unsigned int rs, long addr) { - if (!uasm_in_compat_space_p(addr)) { - uasm_i_lui(buf, rs, uasm_rel_highest(addr)); + if (!ISAFUNC(uasm_in_compat_space_p)(addr)) { + ISAFUNC(uasm_i_lui)(buf, rs, uasm_rel_highest(addr)); if (uasm_rel_higher(addr)) - uasm_i_daddiu(buf, rs, rs, uasm_rel_higher(addr)); - if (uasm_rel_hi(addr)) { - uasm_i_dsll(buf, rs, rs, 16); - uasm_i_daddiu(buf, rs, rs, uasm_rel_hi(addr)); - uasm_i_dsll(buf, rs, rs, 16); + ISAFUNC(uasm_i_daddiu)(buf, rs, rs, uasm_rel_higher(addr)); + if (ISAFUNC(uasm_rel_hi(addr))) { + ISAFUNC(uasm_i_dsll)(buf, rs, rs, 16); + ISAFUNC(uasm_i_daddiu)(buf, rs, rs, + ISAFUNC(uasm_rel_hi)(addr)); + ISAFUNC(uasm_i_dsll)(buf, rs, rs, 16); } else - uasm_i_dsll32(buf, rs, rs, 0); + ISAFUNC(uasm_i_dsll32)(buf, rs, rs, 0); } else - uasm_i_lui(buf, rs, uasm_rel_hi(addr)); + ISAFUNC(uasm_i_lui)(buf, rs, ISAFUNC(uasm_rel_hi(addr))); } -UASM_EXPORT_SYMBOL(UASM_i_LA_mostly); +UASM_EXPORT_SYMBOL(ISAFUNC(UASM_i_LA_mostly)); -void __uasminit UASM_i_LA(u32 **buf, unsigned int rs, long addr) +void __uasminit ISAFUNC(UASM_i_LA)(u32 **buf, unsigned int rs, long addr) { - UASM_i_LA_mostly(buf, rs, addr); - if (uasm_rel_lo(addr)) { - if (!uasm_in_compat_space_p(addr)) - uasm_i_daddiu(buf, rs, rs, uasm_rel_lo(addr)); + ISAFUNC(UASM_i_LA_mostly)(buf, rs, addr); + if (ISAFUNC(uasm_rel_lo(addr))) { + if (!ISAFUNC(uasm_in_compat_space_p)(addr)) + ISAFUNC(uasm_i_daddiu)(buf, rs, rs, + ISAFUNC(uasm_rel_lo(addr))); else - uasm_i_addiu(buf, rs, rs, uasm_rel_lo(addr)); + ISAFUNC(uasm_i_addiu)(buf, rs, rs, + ISAFUNC(uasm_rel_lo(addr))); } } -UASM_EXPORT_SYMBOL(UASM_i_LA); +UASM_EXPORT_SYMBOL(ISAFUNC(UASM_i_LA)); /* Handle relocations. */ void __uasminit -uasm_r_mips_pc16(struct uasm_reloc **rel, u32 *addr, int lid) +ISAFUNC(uasm_r_mips_pc16)(struct uasm_reloc **rel, u32 *addr, int lid) { (*rel)->addr = addr; (*rel)->type = R_MIPS_PC16; (*rel)->lab = lid; (*rel)++; } -UASM_EXPORT_SYMBOL(uasm_r_mips_pc16); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_r_mips_pc16)); static inline void __uasminit -__resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab) -{ - long laddr = (long)lab->addr; - long raddr = (long)rel->addr; - - switch (rel->type) { - case R_MIPS_PC16: - *rel->addr |= build_bimm(laddr - (raddr + 4)); - break; - - default: - panic("Unsupported Micro-assembler relocation %d", - rel->type); - } -} +__resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab); void __uasminit -uasm_resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab) +ISAFUNC(uasm_resolve_relocs)(struct uasm_reloc *rel, struct uasm_label *lab) { struct uasm_label *l; @@ -579,40 +409,40 @@ uasm_resolve_relocs(struct uasm_reloc *rel, struct uasm_label *lab) if (rel->lab == l->lab) __resolve_relocs(rel, l); } -UASM_EXPORT_SYMBOL(uasm_resolve_relocs); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_resolve_relocs)); void __uasminit -uasm_move_relocs(struct uasm_reloc *rel, u32 *first, u32 *end, long off) +ISAFUNC(uasm_move_relocs)(struct uasm_reloc *rel, u32 *first, u32 *end, long off) { for (; rel->lab != UASM_LABEL_INVALID; rel++) if (rel->addr >= first && rel->addr < end) rel->addr += off; } -UASM_EXPORT_SYMBOL(uasm_move_relocs); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_move_relocs)); void __uasminit -uasm_move_labels(struct uasm_label *lab, u32 *first, u32 *end, long off) +ISAFUNC(uasm_move_labels)(struct uasm_label *lab, u32 *first, u32 *end, long off) { for (; lab->lab != UASM_LABEL_INVALID; lab++) if (lab->addr >= first && lab->addr < end) lab->addr += off; } -UASM_EXPORT_SYMBOL(uasm_move_labels); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_move_labels)); void __uasminit -uasm_copy_handler(struct uasm_reloc *rel, struct uasm_label *lab, u32 *first, +ISAFUNC(uasm_copy_handler)(struct uasm_reloc *rel, struct uasm_label *lab, u32 *first, u32 *end, u32 *target) { long off = (long)(target - first); memcpy(target, first, (end - first) * sizeof(u32)); - uasm_move_relocs(rel, first, end, off); - uasm_move_labels(lab, first, end, off); + ISAFUNC(uasm_move_relocs(rel, first, end, off)); + ISAFUNC(uasm_move_labels(lab, first, end, off)); } -UASM_EXPORT_SYMBOL(uasm_copy_handler); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_copy_handler)); -int __uasminit uasm_insn_has_bdelay(struct uasm_reloc *rel, u32 *addr) +int __uasminit ISAFUNC(uasm_insn_has_bdelay)(struct uasm_reloc *rel, u32 *addr) { for (; rel->lab != UASM_LABEL_INVALID; rel++) { if (rel->addr == addr @@ -623,88 +453,88 @@ int __uasminit uasm_insn_has_bdelay(struct uasm_reloc *rel, u32 *addr) return 0; } -UASM_EXPORT_SYMBOL(uasm_insn_has_bdelay); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_insn_has_bdelay)); /* Convenience functions for labeled branches. */ void __uasminit -uasm_il_bltz(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_bltz)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bltz(p, reg, 0); + ISAFUNC(uasm_i_bltz)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bltz); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bltz)); void __uasminit -uasm_il_b(u32 **p, struct uasm_reloc **r, int lid) +ISAFUNC(uasm_il_b)(u32 **p, struct uasm_reloc **r, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_b(p, 0); + ISAFUNC(uasm_i_b)(p, 0); } -UASM_EXPORT_SYMBOL(uasm_il_b); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_b)); void __uasminit -uasm_il_beqz(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_beqz)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_beqz(p, reg, 0); + ISAFUNC(uasm_i_beqz)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_beqz); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_beqz)); void __uasminit -uasm_il_beqzl(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_beqzl)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_beqzl(p, reg, 0); + ISAFUNC(uasm_i_beqzl)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_beqzl); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_beqzl)); void __uasminit -uasm_il_bne(u32 **p, struct uasm_reloc **r, unsigned int reg1, +ISAFUNC(uasm_il_bne)(u32 **p, struct uasm_reloc **r, unsigned int reg1, unsigned int reg2, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bne(p, reg1, reg2, 0); + ISAFUNC(uasm_i_bne)(p, reg1, reg2, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bne); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bne)); void __uasminit -uasm_il_bnez(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_bnez)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bnez(p, reg, 0); + ISAFUNC(uasm_i_bnez)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bnez); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bnez)); void __uasminit -uasm_il_bgezl(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_bgezl)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bgezl(p, reg, 0); + ISAFUNC(uasm_i_bgezl)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bgezl); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bgezl)); void __uasminit -uasm_il_bgez(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) +ISAFUNC(uasm_il_bgez)(u32 **p, struct uasm_reloc **r, unsigned int reg, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bgez(p, reg, 0); + ISAFUNC(uasm_i_bgez)(p, reg, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bgez); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bgez)); void __uasminit -uasm_il_bbit0(u32 **p, struct uasm_reloc **r, unsigned int reg, +ISAFUNC(uasm_il_bbit0)(u32 **p, struct uasm_reloc **r, unsigned int reg, unsigned int bit, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bbit0(p, reg, bit, 0); + ISAFUNC(uasm_i_bbit0)(p, reg, bit, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bbit0); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bbit0)); void __uasminit -uasm_il_bbit1(u32 **p, struct uasm_reloc **r, unsigned int reg, +ISAFUNC(uasm_il_bbit1)(u32 **p, struct uasm_reloc **r, unsigned int reg, unsigned int bit, int lid) { uasm_r_mips_pc16(r, *p, lid); - uasm_i_bbit1(p, reg, bit, 0); + ISAFUNC(uasm_i_bbit1)(p, reg, bit, 0); } -UASM_EXPORT_SYMBOL(uasm_il_bbit1); +UASM_EXPORT_SYMBOL(ISAFUNC(uasm_il_bbit1)); diff --git a/arch/mips/mti-malta/Makefile b/arch/mips/mti-malta/Makefile index 6079ef33b5f0..0388fc8b5613 100644 --- a/arch/mips/mti-malta/Makefile +++ b/arch/mips/mti-malta/Makefile @@ -5,9 +5,8 @@ # Copyright (C) 2008 Wind River Systems, Inc. # written by Ralf Baechle # -obj-y := malta-amon.o malta-cmdline.o \ - malta-display.o malta-init.o malta-int.o \ - malta-memory.o malta-platform.o \ +obj-y := malta-amon.o malta-display.o malta-init.o \ + malta-int.o malta-memory.o malta-platform.o \ malta-reset.o malta-setup.o malta-time.o obj-$(CONFIG_EARLY_PRINTK) += malta-console.o diff --git a/arch/mips/mti-malta/Platform b/arch/mips/mti-malta/Platform index 5b548b5a4fcf..2cc72c9b38e3 100644 --- a/arch/mips/mti-malta/Platform +++ b/arch/mips/mti-malta/Platform @@ -3,5 +3,9 @@ # platform-$(CONFIG_MIPS_MALTA) += mti-malta/ cflags-$(CONFIG_MIPS_MALTA) += -I$(srctree)/arch/mips/include/asm/mach-malta -load-$(CONFIG_MIPS_MALTA) += 0xffffffff80100000 +ifdef CONFIG_KVM_GUEST + load-$(CONFIG_MIPS_MALTA) += 0x0000000040100000 +else + load-$(CONFIG_MIPS_MALTA) += 0xffffffff80100000 +endif all-$(CONFIG_MIPS_MALTA) := $(COMPRESSION_FNAME).bin diff --git a/arch/mips/mti-malta/malta-cmdline.c b/arch/mips/mti-malta/malta-cmdline.c deleted file mode 100644 index 5576a306a145..000000000000 --- a/arch/mips/mti-malta/malta-cmdline.c +++ /dev/null @@ -1,59 +0,0 @@ -/* - * Carsten Langgaard, carstenl@mips.com - * Copyright (C) 1999,2000 MIPS Technologies, Inc. All rights reserved. - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. - * - * Kernel command line creation using the prom monitor (YAMON) argc/argv. - */ -#include -#include - -#include - -extern int prom_argc; -extern int *_prom_argv; - -/* - * YAMON (32-bit PROM) pass arguments and environment as 32-bit pointer. - * This macro take care of sign extension. - */ -#define prom_argv(index) ((char *)(long)_prom_argv[(index)]) - -char * __init prom_getcmdline(void) -{ - return &(arcs_cmdline[0]); -} - - -void __init prom_init_cmdline(void) -{ - char *cp; - int actr; - - actr = 1; /* Always ignore argv[0] */ - - cp = &(arcs_cmdline[0]); - while(actr < prom_argc) { - strcpy(cp, prom_argv(actr)); - cp += strlen(prom_argv(actr)); - *cp++ = ' '; - actr++; - } - if (cp != &(arcs_cmdline[0])) { - /* get rid of trailing space */ - --cp; - *cp = '\0'; - } -} diff --git a/arch/mips/mti-malta/malta-display.c b/arch/mips/mti-malta/malta-display.c index 9bc58a24e80a..d4f807191ecd 100644 --- a/arch/mips/mti-malta/malta-display.c +++ b/arch/mips/mti-malta/malta-display.c @@ -1,28 +1,20 @@ /* - * Carsten Langgaard, carstenl@mips.com - * Copyright (C) 1999,2000 MIPS Technologies, Inc. All rights reserved. - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. * * Display routines for display messages in MIPS boards ascii display. + * + * Copyright (C) 1999,2000,2012 MIPS Technologies, Inc. + * All rights reserved. + * Authors: Carsten Langgaard + * Steven J. Hill */ - #include #include -#include +#include + #include -#include extern const char display_string[]; static unsigned int display_count; @@ -36,11 +28,11 @@ void mips_display_message(const char *str) if (unlikely(display == NULL)) display = ioremap(ASCII_DISPLAY_POS_BASE, 16*sizeof(int)); - for (i = 0; i <= 14; i=i+2) { - if (*str) - __raw_writel(*str++, display + i); - else - __raw_writel(' ', display + i); + for (i = 0; i <= 14; i += 2) { + if (*str) + __raw_writel(*str++, display + i); + else + __raw_writel(' ', display + i); } } diff --git a/arch/mips/mti-malta/malta-init.c b/arch/mips/mti-malta/malta-init.c index c2cbce9e435e..ff8caffd3266 100644 --- a/arch/mips/mti-malta/malta-init.c +++ b/arch/mips/mti-malta/malta-init.c @@ -1,54 +1,28 @@ /* - * Copyright (C) 1999, 2000, 2004, 2005 MIPS Technologies, Inc. - * All rights reserved. - * Authors: Carsten Langgaard - * Maciej W. Rozycki - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. * * PROM library initialisation code. + * + * Copyright (C) 1999,2000,2004,2005,2012 MIPS Technologies, Inc. + * All rights reserved. + * Authors: Carsten Langgaard + * Maciej W. Rozycki + * Steven J. Hill */ #include #include #include -#include -#include -#include #include #include #include - +#include #include -#include #include -#include -#include - #include -int prom_argc; -int *_prom_argv, *_prom_envp; - -/* - * YAMON (32-bit PROM) pass arguments and environment as 32-bit pointer. - * This macro take care of sign extension, if running in 64-bit mode. - */ -#define prom_envp(index) ((char *)(long)_prom_envp[(index)]) - -int init_debug; - static int mips_revision_corid; int mips_revision_sconid; @@ -62,74 +36,6 @@ unsigned long _pcictrl_gt64120; /* MIPS System controller register base */ unsigned long _pcictrl_msc; -char *prom_getenv(char *envname) -{ - /* - * Return a pointer to the given environment variable. - * In 64-bit mode: we're using 64-bit pointers, but all pointers - * in the PROM structures are only 32-bit, so we need some - * workarounds, if we are running in 64-bit mode. - */ - int i, index=0; - - i = strlen(envname); - - while (prom_envp(index)) { - if(strncmp(envname, prom_envp(index), i) == 0) { - return(prom_envp(index+1)); - } - index += 2; - } - - return NULL; -} - -static inline unsigned char str2hexnum(unsigned char c) -{ - if (c >= '0' && c <= '9') - return c - '0'; - if (c >= 'a' && c <= 'f') - return c - 'a' + 10; - return 0; /* foo */ -} - -static inline void str2eaddr(unsigned char *ea, unsigned char *str) -{ - int i; - - for (i = 0; i < 6; i++) { - unsigned char num; - - if((*str == '.') || (*str == ':')) - str++; - num = str2hexnum(*str++) << 4; - num |= (str2hexnum(*str++)); - ea[i] = num; - } -} - -int get_ethernet_addr(char *ethernet_addr) -{ - char *ethaddr_str; - - ethaddr_str = prom_getenv("ethaddr"); - if (!ethaddr_str) { - printk("ethaddr not set in boot prom\n"); - return -1; - } - str2eaddr(ethernet_addr, ethaddr_str); - - if (init_debug > 1) { - int i; - printk("get_ethernet_addr: "); - for (i=0; i<5; i++) - printk("%02x:", (unsigned char)*(ethernet_addr+i)); - printk("%02x\n", *(ethernet_addr+i)); - } - - return 0; -} - #ifdef CONFIG_SERIAL_8250_CONSOLE static void __init console_config(void) { @@ -138,17 +44,23 @@ static void __init console_config(void) char parity = '\0', bits = '\0', flow = '\0'; char *s; - if ((strstr(prom_getcmdline(), "console=")) == NULL) { - s = prom_getenv("modetty0"); + if ((strstr(fw_getcmdline(), "console=")) == NULL) { + s = fw_getenv("modetty0"); if (s) { while (*s >= '0' && *s <= '9') baud = baud*10 + *s++ - '0'; - if (*s == ',') s++; - if (*s) parity = *s++; - if (*s == ',') s++; - if (*s) bits = *s++; - if (*s == ',') s++; - if (*s == 'h') flow = 'r'; + if (*s == ',') + s++; + if (*s) + parity = *s++; + if (*s == ',') + s++; + if (*s) + bits = *s++; + if (*s == ',') + s++; + if (*s == 'h') + flow = 'r'; } if (baud == 0) baud = 38400; @@ -158,8 +70,9 @@ static void __init console_config(void) bits = '8'; if (flow == '\0') flow = 'r'; - sprintf(console_string, " console=ttyS0,%d%c%c%c", baud, parity, bits, flow); - strcat(prom_getcmdline(), console_string); + sprintf(console_string, " console=ttyS0,%d%c%c%c", baud, + parity, bits, flow); + strcat(fw_getcmdline(), console_string); pr_info("Config serial console:%s\n", console_string); } } @@ -193,10 +106,6 @@ extern struct plat_smp_ops msmtc_smp_ops; void __init prom_init(void) { - prom_argc = fw_arg0; - _prom_argv = (int *) fw_arg1; - _prom_envp = (int *) fw_arg2; - mips_display_message("LINUX"); /* @@ -306,7 +215,7 @@ void __init prom_init(void) case MIPS_REVISION_SCON_SOCIT: case MIPS_REVISION_SCON_ROCIT: _pcictrl_msc = (unsigned long)ioremap(MIPS_MSC01_PCI_REG_BASE, 0x2000); - mips_pci_controller: +mips_pci_controller: mb(); MSC_READ(MSC01_PCI_CFG, data); MSC_WRITE(MSC01_PCI_CFG, data & ~MSC01_PCI_CFG_EN_BIT); @@ -348,13 +257,13 @@ void __init prom_init(void) default: /* Unknown system controller */ mips_display_message("SC Error"); - while (1); /* We die here... */ + while (1); /* We die here... */ } board_nmi_handler_setup = mips_nmi_setup; board_ejtag_handler_setup = mips_ejtag_setup; - prom_init_cmdline(); - prom_meminit(); + fw_init_cmdline(); + fw_meminit(); #ifdef CONFIG_SERIAL_8250_CONSOLE console_config(); #endif diff --git a/arch/mips/mti-malta/malta-int.c b/arch/mips/mti-malta/malta-int.c index e364af70e6cf..0a1339ac3ec8 100644 --- a/arch/mips/mti-malta/malta-int.c +++ b/arch/mips/mti-malta/malta-int.c @@ -47,7 +47,6 @@ #include int gcmp_present = -1; -int gic_present; static unsigned long _msc01_biu_base; static unsigned long _gcmp_base; static unsigned int ipi_map[NR_CPUS]; @@ -134,6 +133,9 @@ static void malta_ipi_irqdispatch(void) { int irq; + if (gic_compare_int()) + do_IRQ(MIPS_GIC_IRQ_BASE); + irq = gic_get_int(); if (irq < 0) return; /* interrupt has already been cleared */ diff --git a/arch/mips/mti-malta/malta-memory.c b/arch/mips/mti-malta/malta-memory.c index f3d43aa023a9..1f73d63e92a7 100644 --- a/arch/mips/mti-malta/malta-memory.c +++ b/arch/mips/mti-malta/malta-memory.c @@ -1,73 +1,45 @@ /* - * Carsten Langgaard, carstenl@mips.com - * Copyright (C) 1999,2000 MIPS Technologies, Inc. All rights reserved. - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. * * PROM library functions for acquiring/using memory descriptors given to * us from the YAMON. + * + * Copyright (C) 1999,2000,2012 MIPS Technologies, Inc. + * All rights reserved. + * Authors: Carsten Langgaard + * Steven J. Hill */ #include -#include #include -#include #include #include -#include #include +#include -#include - -/*#define DEBUG*/ - -enum yamon_memtypes { - yamon_dontuse, - yamon_prom, - yamon_free, -}; -static struct prom_pmemblock mdesc[PROM_MAX_PMEMBLOCKS]; - -#ifdef DEBUG -static char *mtypes[3] = { - "Dont use memory", - "YAMON PROM memory", - "Free memory", -}; -#endif +static fw_memblock_t mdesc[FW_MAX_MEMBLOCKS]; /* determined physical memory size, not overridden by command line args */ unsigned long physical_memsize = 0L; -static struct prom_pmemblock * __init prom_getmdesc(void) +fw_memblock_t * __init fw_getmdesc(void) { - char *memsize_str; + char *memsize_str, *ptr; unsigned int memsize; - char *ptr; static char cmdline[COMMAND_LINE_SIZE] __initdata; + long val; + int tmp; /* otherwise look in the environment */ - memsize_str = prom_getenv("memsize"); + memsize_str = fw_getenv("memsize"); if (!memsize_str) { - printk(KERN_WARNING - "memsize not set in boot prom, set to default (32Mb)\n"); + pr_warn("memsize not set in YAMON, set to default (32Mb)\n"); physical_memsize = 0x02000000; } else { -#ifdef DEBUG - pr_debug("prom_memsize = %s\n", memsize_str); -#endif - physical_memsize = simple_strtol(memsize_str, NULL, 0); + tmp = kstrtol(memsize_str, 0, &val); + physical_memsize = (unsigned long)val; } #ifdef CONFIG_CPU_BIG_ENDIAN @@ -90,11 +62,11 @@ static struct prom_pmemblock * __init prom_getmdesc(void) memset(mdesc, 0, sizeof(mdesc)); - mdesc[0].type = yamon_dontuse; + mdesc[0].type = fw_dontuse; mdesc[0].base = 0x00000000; mdesc[0].size = 0x00001000; - mdesc[1].type = yamon_prom; + mdesc[1].type = fw_code; mdesc[1].base = 0x00001000; mdesc[1].size = 0x000ef000; @@ -105,55 +77,45 @@ static struct prom_pmemblock * __init prom_getmdesc(void) * This mean that this area can't be used as DMA memory for PCI * devices. */ - mdesc[2].type = yamon_dontuse; + mdesc[2].type = fw_dontuse; mdesc[2].base = 0x000f0000; mdesc[2].size = 0x00010000; - mdesc[3].type = yamon_dontuse; + mdesc[3].type = fw_dontuse; mdesc[3].base = 0x00100000; - mdesc[3].size = CPHYSADDR(PFN_ALIGN((unsigned long)&_end)) - mdesc[3].base; + mdesc[3].size = CPHYSADDR(PFN_ALIGN((unsigned long)&_end)) - + mdesc[3].base; - mdesc[4].type = yamon_free; + mdesc[4].type = fw_free; mdesc[4].base = CPHYSADDR(PFN_ALIGN(&_end)); mdesc[4].size = memsize - mdesc[4].base; return &mdesc[0]; } -static int __init prom_memtype_classify(unsigned int type) +static int __init fw_memtype_classify(unsigned int type) { switch (type) { - case yamon_free: + case fw_free: return BOOT_MEM_RAM; - case yamon_prom: + case fw_code: return BOOT_MEM_ROM_DATA; default: return BOOT_MEM_RESERVED; } } -void __init prom_meminit(void) +void __init fw_meminit(void) { - struct prom_pmemblock *p; + fw_memblock_t *p; -#ifdef DEBUG - pr_debug("YAMON MEMORY DESCRIPTOR dump:\n"); - p = prom_getmdesc(); - while (p->size) { - int i = 0; - pr_debug("[%d,%p]: base<%08lx> size<%08lx> type<%s>\n", - i, p, p->base, p->size, mtypes[p->type]); - p++; - i++; - } -#endif - p = prom_getmdesc(); + p = fw_getmdesc(); while (p->size) { long type; unsigned long base, size; - type = prom_memtype_classify(p->type); + type = fw_memtype_classify(p->type); base = p->base; size = p->size; @@ -172,7 +134,7 @@ void __init prom_free_prom_memory(void) continue; addr = boot_mem_map.map[i].addr; - free_init_pages("prom memory", + free_init_pages("YAMON memory", addr, addr + boot_mem_map.map[i].size); } } diff --git a/arch/mips/mti-malta/malta-setup.c b/arch/mips/mti-malta/malta-setup.c index 200f64df2c9b..c72a06936781 100644 --- a/arch/mips/mti-malta/malta-setup.c +++ b/arch/mips/mti-malta/malta-setup.c @@ -25,13 +25,13 @@ #include #include -#include +#include #include -#include #include #include #include #include +#include #ifdef CONFIG_VT #include #endif @@ -105,6 +105,66 @@ static void __init fd_activate(void) } #endif +static int __init plat_enable_iocoherency(void) +{ + int supported = 0; + if (mips_revision_sconid == MIPS_REVISION_SCON_BONITO) { + if (BONITO_PCICACHECTRL & BONITO_PCICACHECTRL_CPUCOH_PRES) { + BONITO_PCICACHECTRL |= BONITO_PCICACHECTRL_CPUCOH_EN; + pr_info("Enabled Bonito CPU coherency\n"); + supported = 1; + } + if (strstr(fw_getcmdline(), "iobcuncached")) { + BONITO_PCICACHECTRL &= ~BONITO_PCICACHECTRL_IOBCCOH_EN; + BONITO_PCIMEMBASECFG = BONITO_PCIMEMBASECFG & + ~(BONITO_PCIMEMBASECFG_MEMBASE0_CACHED | + BONITO_PCIMEMBASECFG_MEMBASE1_CACHED); + pr_info("Disabled Bonito IOBC coherency\n"); + } else { + BONITO_PCICACHECTRL |= BONITO_PCICACHECTRL_IOBCCOH_EN; + BONITO_PCIMEMBASECFG |= + (BONITO_PCIMEMBASECFG_MEMBASE0_CACHED | + BONITO_PCIMEMBASECFG_MEMBASE1_CACHED); + pr_info("Enabled Bonito IOBC coherency\n"); + } + } else if (gcmp_niocu() != 0) { + /* Nothing special needs to be done to enable coherency */ + pr_info("CMP IOCU detected\n"); + if ((*(unsigned int *)0xbf403000 & 0x81) != 0x81) { + pr_crit("IOCU OPERATION DISABLED BY SWITCH - DEFAULTING TO SW IO COHERENCY\n"); + return 0; + } + supported = 1; + } + hw_coherentio = supported; + return supported; +} + +static void __init plat_setup_iocoherency(void) +{ +#ifdef CONFIG_DMA_NONCOHERENT + /* + * Kernel has been configured with software coherency + * but we might choose to turn it off and use hardware + * coherency instead. + */ + if (plat_enable_iocoherency()) { + if (coherentio == 0) + pr_info("Hardware DMA cache coherency disabled\n"); + else + pr_info("Hardware DMA cache coherency enabled\n"); + } else { + if (coherentio == 1) + pr_info("Hardware DMA cache coherency unsupported, but enabled from command line!\n"); + else + pr_info("Software DMA cache coherency enabled\n"); + } +#else + if (!plat_enable_iocoherency()) + panic("Hardware DMA cache coherency not supported!"); +#endif +} + #ifdef CONFIG_BLK_DEV_IDE static void __init pci_clock_check(void) { @@ -115,16 +175,15 @@ static void __init pci_clock_check(void) 33, 20, 25, 30, 12, 16, 37, 10 }; int pciclock = pciclocks[jmpr]; - char *argptr = prom_getcmdline(); + char *argptr = fw_getcmdline(); if (pciclock != 33 && !strstr(argptr, "idebus=")) { - printk(KERN_WARNING "WARNING: PCI clock is %dMHz, " - "setting idebus\n", pciclock); + pr_warn("WARNING: PCI clock is %dMHz, setting idebus\n", + pciclock); argptr += strlen(argptr); sprintf(argptr, " idebus=%d", pciclock); if (pciclock < 20 || pciclock > 66) - printk(KERN_WARNING "WARNING: IDE timing " - "calculations will be incorrect\n"); + pr_warn("WARNING: IDE timing calculations will be incorrect\n"); } } #endif @@ -153,31 +212,31 @@ static void __init bonito_quirks_setup(void) { char *argptr; - argptr = prom_getcmdline(); + argptr = fw_getcmdline(); if (strstr(argptr, "debug")) { BONITO_BONGENCFG |= BONITO_BONGENCFG_DEBUGMODE; - printk(KERN_INFO "Enabled Bonito debug mode\n"); + pr_info("Enabled Bonito debug mode\n"); } else BONITO_BONGENCFG &= ~BONITO_BONGENCFG_DEBUGMODE; #ifdef CONFIG_DMA_COHERENT if (BONITO_PCICACHECTRL & BONITO_PCICACHECTRL_CPUCOH_PRES) { BONITO_PCICACHECTRL |= BONITO_PCICACHECTRL_CPUCOH_EN; - printk(KERN_INFO "Enabled Bonito CPU coherency\n"); + pr_info("Enabled Bonito CPU coherency\n"); - argptr = prom_getcmdline(); + argptr = fw_getcmdline(); if (strstr(argptr, "iobcuncached")) { BONITO_PCICACHECTRL &= ~BONITO_PCICACHECTRL_IOBCCOH_EN; BONITO_PCIMEMBASECFG = BONITO_PCIMEMBASECFG & ~(BONITO_PCIMEMBASECFG_MEMBASE0_CACHED | BONITO_PCIMEMBASECFG_MEMBASE1_CACHED); - printk(KERN_INFO "Disabled Bonito IOBC coherency\n"); + pr_info("Disabled Bonito IOBC coherency\n"); } else { BONITO_PCICACHECTRL |= BONITO_PCICACHECTRL_IOBCCOH_EN; BONITO_PCIMEMBASECFG |= (BONITO_PCIMEMBASECFG_MEMBASE0_CACHED | BONITO_PCIMEMBASECFG_MEMBASE1_CACHED); - printk(KERN_INFO "Enabled Bonito IOBC coherency\n"); + pr_info("Enabled Bonito IOBC coherency\n"); } } else panic("Hardware DMA cache coherency not supported"); @@ -207,6 +266,8 @@ void __init plat_mem_setup(void) if (mips_revision_sconid == MIPS_REVISION_SCON_BONITO) bonito_quirks_setup(); + plat_setup_iocoherency(); + #ifdef CONFIG_BLK_DEV_IDE pci_clock_check(); #endif diff --git a/arch/mips/mti-malta/malta-time.c b/arch/mips/mti-malta/malta-time.c index a144b89cf9ba..0ad305f75802 100644 --- a/arch/mips/mti-malta/malta-time.c +++ b/arch/mips/mti-malta/malta-time.c @@ -39,12 +39,9 @@ #include #include -#include - #include unsigned long cpu_khz; -int gic_frequency; static int mips_cpu_timer_irq; static int mips_cpu_perf_irq; @@ -74,7 +71,24 @@ static void __init estimate_frequencies(void) { unsigned long flags; unsigned int count, start; +#ifdef CONFIG_IRQ_GIC unsigned int giccount = 0, gicstart = 0; +#endif + +#if defined (CONFIG_KVM_GUEST) && defined (CONFIG_KVM_HOST_FREQ) + unsigned int prid = read_c0_prid() & 0xffff00; + + /* + * XXXKYMA: hardwire the CPU frequency to Host Freq/4 + */ + count = (CONFIG_KVM_HOST_FREQ * 1000000) >> 3; + if ((prid != (PRID_COMP_MIPS | PRID_IMP_20KC)) && + (prid != (PRID_COMP_MIPS | PRID_IMP_25KF))) + count *= 2; + + mips_hpt_frequency = count; + return; +#endif local_irq_save(flags); @@ -84,26 +98,32 @@ static void __init estimate_frequencies(void) /* Initialize counters. */ start = read_c0_count(); +#ifdef CONFIG_IRQ_GIC if (gic_present) GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_31_00), gicstart); +#endif /* Read counter exactly on falling edge of update flag. */ while (CMOS_READ(RTC_REG_A) & RTC_UIP); while (!(CMOS_READ(RTC_REG_A) & RTC_UIP)); count = read_c0_count(); +#ifdef CONFIG_IRQ_GIC if (gic_present) GICREAD(GIC_REG(SHARED, GIC_SH_COUNTER_31_00), giccount); +#endif local_irq_restore(flags); count -= start; - if (gic_present) - giccount -= gicstart; - mips_hpt_frequency = count; - if (gic_present) + +#ifdef CONFIG_IRQ_GIC + if (gic_present) { + giccount -= gicstart; gic_frequency = giccount; + } +#endif } void read_persistent_clock(struct timespec *ts) @@ -159,24 +179,27 @@ void __init plat_time_init(void) (prid != (PRID_COMP_MIPS | PRID_IMP_25KF))) freq *= 2; freq = freqround(freq, 5000); - pr_debug("CPU frequency %d.%02d MHz\n", freq/1000000, + printk("CPU frequency %d.%02d MHz\n", freq/1000000, (freq%1000000)*100/1000000); cpu_khz = freq / 1000; - if (gic_present) { - freq = freqround(gic_frequency, 5000); - pr_debug("GIC frequency %d.%02d MHz\n", freq/1000000, - (freq%1000000)*100/1000000); - gic_clocksource_init(gic_frequency); - } else - init_r4k_clocksource(); + mips_scroll_message(); #ifdef CONFIG_I8253 /* Only Malta has a PIT. */ setup_pit_timer(); #endif - mips_scroll_message(); +#ifdef CONFIG_IRQ_GIC + if (gic_present) { + freq = freqround(gic_frequency, 5000); + printk("GIC frequency %d.%02d MHz\n", freq/1000000, + (freq%1000000)*100/1000000); +#ifdef CONFIG_CSRC_GIC + gic_clocksource_init(gic_frequency); +#endif + } +#endif plat_perf_setup(); } diff --git a/arch/mips/mti-sead3/Makefile b/arch/mips/mti-sead3/Makefile index 10ec701ce6c7..be114209217c 100644 --- a/arch/mips/mti-sead3/Makefile +++ b/arch/mips/mti-sead3/Makefile @@ -8,10 +8,10 @@ # Copyright (C) 2012 MIPS Technoligies, Inc. All rights reserved. # Steven J. Hill # -obj-y := sead3-lcd.o sead3-cmdline.o \ - sead3-display.o sead3-init.o sead3-int.o \ - sead3-mtd.o sead3-net.o sead3-platform.o \ - sead3-reset.o sead3-setup.o sead3-time.o +obj-y := sead3-lcd.o sead3-display.o sead3-init.o \ + sead3-int.o sead3-mtd.o sead3-net.o \ + sead3-platform.o sead3-reset.o \ + sead3-setup.o sead3-time.o obj-y += sead3-i2c-dev.o sead3-i2c.o \ sead3-pic32-i2c-drv.o sead3-pic32-bus.o \ diff --git a/arch/mips/mti-sead3/leds-sead3.c b/arch/mips/mti-sead3/leds-sead3.c index 322148c353ed..0a168c948b01 100644 --- a/arch/mips/mti-sead3/leds-sead3.c +++ b/arch/mips/mti-sead3/leds-sead3.c @@ -34,33 +34,15 @@ static void sead3_fled_set(struct led_classdev *led_cdev, static struct led_classdev sead3_pled = { .name = "sead3::pled", .brightness_set = sead3_pled_set, + .flags = LED_CORE_SUSPENDRESUME, }; static struct led_classdev sead3_fled = { .name = "sead3::fled", .brightness_set = sead3_fled_set, + .flags = LED_CORE_SUSPENDRESUME, }; -#ifdef CONFIG_PM -static int sead3_led_suspend(struct platform_device *dev, - pm_message_t state) -{ - led_classdev_suspend(&sead3_pled); - led_classdev_suspend(&sead3_fled); - return 0; -} - -static int sead3_led_resume(struct platform_device *dev) -{ - led_classdev_resume(&sead3_pled); - led_classdev_resume(&sead3_fled); - return 0; -} -#else -#define sead3_led_suspend NULL -#define sead3_led_resume NULL -#endif - static int sead3_led_probe(struct platform_device *pdev) { int ret; @@ -86,8 +68,6 @@ static int sead3_led_remove(struct platform_device *pdev) static struct platform_driver sead3_led_driver = { .probe = sead3_led_probe, .remove = sead3_led_remove, - .suspend = sead3_led_suspend, - .resume = sead3_led_resume, .driver = { .name = DRVNAME, .owner = THIS_MODULE, diff --git a/arch/mips/mti-sead3/sead3-cmdline.c b/arch/mips/mti-sead3/sead3-cmdline.c deleted file mode 100644 index a2e6cec67f57..000000000000 --- a/arch/mips/mti-sead3/sead3-cmdline.c +++ /dev/null @@ -1,46 +0,0 @@ -/* - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file "COPYING" in the main directory of this archive - * for more details. - * - * Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved. - */ -#include -#include - -#include - -extern int prom_argc; -extern int *_prom_argv; - -/* - * YAMON (32-bit PROM) pass arguments and environment as 32-bit pointer. - * This macro take care of sign extension. - */ -#define prom_argv(index) ((char *)(long)_prom_argv[(index)]) - -char * __init prom_getcmdline(void) -{ - return &(arcs_cmdline[0]); -} - -void __init prom_init_cmdline(void) -{ - char *cp; - int actr; - - actr = 1; /* Always ignore argv[0] */ - - cp = &(arcs_cmdline[0]); - while (actr < prom_argc) { - strcpy(cp, prom_argv(actr)); - cp += strlen(prom_argv(actr)); - *cp++ = ' '; - actr++; - } - if (cp != &(arcs_cmdline[0])) { - /* get rid of trailing space */ - --cp; - *cp = '\0'; - } -} diff --git a/arch/mips/mti-sead3/sead3-console.c b/arch/mips/mti-sead3/sead3-console.c index 2ddef19a9adc..031f47d69770 100644 --- a/arch/mips/mti-sead3/sead3-console.c +++ b/arch/mips/mti-sead3/sead3-console.c @@ -26,7 +26,7 @@ static inline void serial_out(int offset, int value, unsigned int base_addr) __raw_writel(value, PORT(base_addr, offset)); } -void __init prom_init_early_console(char port) +void __init fw_init_early_console(char port) { console_port = port; } diff --git a/arch/mips/mti-sead3/sead3-display.c b/arch/mips/mti-sead3/sead3-display.c index e389326cfa42..94875991907b 100644 --- a/arch/mips/mti-sead3/sead3-display.c +++ b/arch/mips/mti-sead3/sead3-display.c @@ -8,7 +8,6 @@ #include #include #include -#include static unsigned int display_count; static unsigned int max_display_count; diff --git a/arch/mips/mti-sead3/sead3-init.c b/arch/mips/mti-sead3/sead3-init.c index f95abaa1aa5d..bfbd17b120a2 100644 --- a/arch/mips/mti-sead3/sead3-init.c +++ b/arch/mips/mti-sead3/sead3-init.c @@ -12,38 +12,51 @@ #include #include #include -#include - -extern void prom_init_early_console(char port); +#include extern char except_vec_nmi; extern char except_vec_ejtag_debug; -int prom_argc; -int *_prom_argv, *_prom_envp; - -#define prom_envp(index) ((char *)(long)_prom_envp[(index)]) - -char *prom_getenv(char *envname) +#ifdef CONFIG_SERIAL_8250_CONSOLE +static void __init console_config(void) { - /* - * Return a pointer to the given environment variable. - * In 64-bit mode: we're using 64-bit pointers, but all pointers - * in the PROM structures are only 32-bit, so we need some - * workarounds, if we are running in 64-bit mode. - */ - int i, index = 0; - - i = strlen(envname); - - while (prom_envp(index)) { - if (strncmp(envname, prom_envp(index), i) == 0) - return prom_envp(index+1); - index += 2; + char console_string[40]; + int baud = 0; + char parity = '\0', bits = '\0', flow = '\0'; + char *s; + + if ((strstr(fw_getcmdline(), "console=")) == NULL) { + s = fw_getenv("modetty0"); + if (s) { + while (*s >= '0' && *s <= '9') + baud = baud*10 + *s++ - '0'; + if (*s == ',') + s++; + if (*s) + parity = *s++; + if (*s == ',') + s++; + if (*s) + bits = *s++; + if (*s == ',') + s++; + if (*s == 'h') + flow = 'r'; + } + if (baud == 0) + baud = 38400; + if (parity != 'n' && parity != 'o' && parity != 'e') + parity = 'n'; + if (bits != '7' && bits != '8') + bits = '8'; + if (flow == '\0') + flow = 'r'; + sprintf(console_string, " console=ttyS0,%d%c%c%c", baud, + parity, bits, flow); + strcat(fw_getcmdline(), console_string); } - - return NULL; } +#endif static void __init mips_nmi_setup(void) { @@ -52,7 +65,41 @@ static void __init mips_nmi_setup(void) base = cpu_has_veic ? (void *)(CAC_BASE + 0xa80) : (void *)(CAC_BASE + 0x380); +#ifdef CONFIG_CPU_MICROMIPS + /* + * Decrement the exception vector address by one for microMIPS. + */ + memcpy(base, (&except_vec_nmi - 1), 0x80); + + /* + * This is a hack. We do not know if the boot loader was built with + * microMIPS instructions or not. If it was not, the NMI exception + * code at 0x80000a80 will be taken in MIPS32 mode. The hand coded + * assembly below forces us into microMIPS mode if we are a pure + * microMIPS kernel. The assembly instructions are: + * + * 3C1A8000 lui k0,0x8000 + * 375A0381 ori k0,k0,0x381 + * 03400008 jr k0 + * 00000000 nop + * + * The mode switch occurs by jumping to the unaligned exception + * vector address at 0x80000381 which would have been 0x80000380 + * in MIPS32 mode. The jump to the unaligned address transitions + * us into microMIPS mode. + */ + if (!cpu_has_veic) { + void *base2 = (void *)(CAC_BASE + 0xa80); + *((unsigned int *)base2) = 0x3c1a8000; + *((unsigned int *)base2 + 1) = 0x375a0381; + *((unsigned int *)base2 + 2) = 0x03400008; + *((unsigned int *)base2 + 3) = 0x00000000; + flush_icache_range((unsigned long)base2, + (unsigned long)base2 + 0x10); + } +#else memcpy(base, &except_vec_nmi, 0x80); +#endif flush_icache_range((unsigned long)base, (unsigned long)base + 0x80); } @@ -63,29 +110,40 @@ static void __init mips_ejtag_setup(void) base = cpu_has_veic ? (void *)(CAC_BASE + 0xa00) : (void *)(CAC_BASE + 0x300); +#ifdef CONFIG_CPU_MICROMIPS + /* Deja vu... */ + memcpy(base, (&except_vec_ejtag_debug - 1), 0x80); + if (!cpu_has_veic) { + void *base2 = (void *)(CAC_BASE + 0xa00); + *((unsigned int *)base2) = 0x3c1a8000; + *((unsigned int *)base2 + 1) = 0x375a0301; + *((unsigned int *)base2 + 2) = 0x03400008; + *((unsigned int *)base2 + 3) = 0x00000000; + flush_icache_range((unsigned long)base2, + (unsigned long)base2 + 0x10); + } +#else memcpy(base, &except_vec_ejtag_debug, 0x80); +#endif flush_icache_range((unsigned long)base, (unsigned long)base + 0x80); } void __init prom_init(void) { - prom_argc = fw_arg0; - _prom_argv = (int *) fw_arg1; - _prom_envp = (int *) fw_arg2; - board_nmi_handler_setup = mips_nmi_setup; board_ejtag_handler_setup = mips_ejtag_setup; - prom_init_cmdline(); + fw_init_cmdline(); #ifdef CONFIG_EARLY_PRINTK - if ((strstr(prom_getcmdline(), "console=ttyS0")) != NULL) - prom_init_early_console(0); - else if ((strstr(prom_getcmdline(), "console=ttyS1")) != NULL) - prom_init_early_console(1); + if ((strstr(fw_getcmdline(), "console=ttyS0")) != NULL) + fw_init_early_console(0); + else if ((strstr(fw_getcmdline(), "console=ttyS1")) != NULL) + fw_init_early_console(1); #endif #ifdef CONFIG_SERIAL_8250_CONSOLE - if ((strstr(prom_getcmdline(), "console=")) == NULL) - strcat(prom_getcmdline(), " console=ttyS0,38400n8r"); + if ((strstr(fw_getcmdline(), "console=")) == NULL) + strcat(fw_getcmdline(), " console=ttyS0,38400n8r"); + console_config(); #endif } diff --git a/arch/mips/mti-sead3/sead3-int.c b/arch/mips/mti-sead3/sead3-int.c index e26e08274fc5..6a560ac03def 100644 --- a/arch/mips/mti-sead3/sead3-int.c +++ b/arch/mips/mti-sead3/sead3-int.c @@ -20,7 +20,6 @@ #define SEAD_CONFIG_BASE 0x1b100110 #define SEAD_CONFIG_SIZE 4 -int gic_present; static unsigned long sead3_config_reg; /* diff --git a/arch/mips/mti-sead3/sead3-setup.c b/arch/mips/mti-sead3/sead3-setup.c index f012fd164cee..b5059dc899f4 100644 --- a/arch/mips/mti-sead3/sead3-setup.c +++ b/arch/mips/mti-sead3/sead3-setup.c @@ -11,10 +11,6 @@ #include #include -#include - -int coherentio; /* 0 => no DMA cache coherency (may be set by user) */ -int hw_coherentio; /* 0 => no HW DMA cache coherency (reflects real HW) */ const char *get_system_type(void) { diff --git a/arch/mips/mti-sead3/sead3-time.c b/arch/mips/mti-sead3/sead3-time.c index 239e4e32757f..96b42eb9b5e2 100644 --- a/arch/mips/mti-sead3/sead3-time.c +++ b/arch/mips/mti-sead3/sead3-time.c @@ -11,7 +11,6 @@ #include #include #include -#include unsigned long cpu_khz; diff --git a/arch/mips/netlogic/Kconfig b/arch/mips/netlogic/Kconfig index 3c05bf9e280a..e0873a31ebaa 100644 --- a/arch/mips/netlogic/Kconfig +++ b/arch/mips/netlogic/Kconfig @@ -2,13 +2,22 @@ if NLM_XLP_BOARD || NLM_XLR_BOARD if NLM_XLP_BOARD config DT_XLP_EVP - bool "Built-in device tree for XLP EVP/SVP boards" + bool "Built-in device tree for XLP EVP boards" default y help - Add an FDT blob for XLP EVP and SVP boards into the kernel. + Add an FDT blob for XLP EVP boards into the kernel. This DTB will be used if the firmware does not pass in a DTB - pointer to the kernel. The corresponding DTS file is at - arch/mips/netlogic/dts/xlp_evp.dts + pointer to the kernel. The corresponding DTS file is at + arch/mips/netlogic/dts/xlp_evp.dts + +config DT_XLP_SVP + bool "Built-in device tree for XLP SVP boards" + default y + help + Add an FDT blob for XLP VP boards into the kernel. + This DTB will be used if the firmware does not pass in a DTB + pointer to the kernel. The corresponding DTS file is at + arch/mips/netlogic/dts/xlp_svp.dts config NLM_MULTINODE bool "Support for multi-chip boards" diff --git a/arch/mips/netlogic/common/smp.c b/arch/mips/netlogic/common/smp.c index 2bb95dcfe20a..ffba52489bef 100644 --- a/arch/mips/netlogic/common/smp.c +++ b/arch/mips/netlogic/common/smp.c @@ -148,8 +148,7 @@ void nlm_cpus_done(void) int nlm_cpu_ready[NR_CPUS]; unsigned long nlm_next_gp; unsigned long nlm_next_sp; - -cpumask_t phys_cpu_present_map; +static cpumask_t phys_cpu_present_mask; void nlm_boot_secondary(int logical_cpu, struct task_struct *idle) { @@ -169,11 +168,12 @@ void __init nlm_smp_setup(void) { unsigned int boot_cpu; int num_cpus, i, ncore; + char buf[64]; boot_cpu = hard_smp_processor_id(); - cpumask_clear(&phys_cpu_present_map); + cpumask_clear(&phys_cpu_present_mask); - cpumask_set_cpu(boot_cpu, &phys_cpu_present_map); + cpumask_set_cpu(boot_cpu, &phys_cpu_present_mask); __cpu_number_map[boot_cpu] = 0; __cpu_logical_map[0] = boot_cpu; set_cpu_possible(0, true); @@ -185,7 +185,7 @@ void __init nlm_smp_setup(void) * it is only set for ASPs (see smpboot.S) */ if (nlm_cpu_ready[i]) { - cpumask_set_cpu(i, &phys_cpu_present_map); + cpumask_set_cpu(i, &phys_cpu_present_mask); __cpu_number_map[i] = num_cpus; __cpu_logical_map[num_cpus] = i; set_cpu_possible(num_cpus, true); @@ -193,16 +193,19 @@ void __init nlm_smp_setup(void) } } + cpumask_scnprintf(buf, ARRAY_SIZE(buf), &phys_cpu_present_mask); + pr_info("Physical CPU mask: %s\n", buf); + cpumask_scnprintf(buf, ARRAY_SIZE(buf), cpu_possible_mask); + pr_info("Possible CPU mask: %s\n", buf); + /* check with the cores we have worken up */ for (ncore = 0, i = 0; i < NLM_NR_NODES; i++) ncore += hweight32(nlm_get_node(i)->coremask); - pr_info("Phys CPU present map: %lx, possible map %lx\n", - (unsigned long)cpumask_bits(&phys_cpu_present_map)[0], - (unsigned long)cpumask_bits(cpu_possible_mask)[0]); - pr_info("Detected (%dc%dt) %d Slave CPU(s)\n", ncore, nlm_threads_per_core, num_cpus); + + /* switch NMI handler to boot CPUs */ nlm_set_nmi_handler(nlm_boot_secondary_cpus); } diff --git a/arch/mips/netlogic/dts/Makefile b/arch/mips/netlogic/dts/Makefile index d117d46413aa..aecb6fa9a9c3 100644 --- a/arch/mips/netlogic/dts/Makefile +++ b/arch/mips/netlogic/dts/Makefile @@ -1 +1,2 @@ obj-$(CONFIG_DT_XLP_EVP) := xlp_evp.dtb.o +obj-$(CONFIG_DT_XLP_SVP) += xlp_svp.dtb.o diff --git a/arch/mips/netlogic/dts/xlp_evp.dts b/arch/mips/netlogic/dts/xlp_evp.dts index 7628b5464fc7..e14f42308064 100644 --- a/arch/mips/netlogic/dts/xlp_evp.dts +++ b/arch/mips/netlogic/dts/xlp_evp.dts @@ -20,7 +20,7 @@ #address-cells = <2>; #size-cells = <1>; compatible = "simple-bus"; - ranges = <0 0 0 0x18000000 0x04000000 // PCIe CFG + ranges = <0 0 0 0x18000000 0x04000000 // PCIe CFG 1 0 0 0x16000000 0x01000000>; // GBU chipselects serial0: serial@30000 { diff --git a/arch/mips/netlogic/dts/xlp_svp.dts b/arch/mips/netlogic/dts/xlp_svp.dts new file mode 100644 index 000000000000..8af4bdbe5d99 --- /dev/null +++ b/arch/mips/netlogic/dts/xlp_svp.dts @@ -0,0 +1,124 @@ +/* + * XLP3XX Device Tree Source for SVP boards + */ + +/dts-v1/; +/ { + model = "netlogic,XLP-SVP"; + compatible = "netlogic,xlp"; + #address-cells = <2>; + #size-cells = <2>; + + memory { + device_type = "memory"; + reg = <0 0x00100000 0 0x0FF00000 // 255M at 1M + 0 0x20000000 0 0xa0000000 // 2560M at 512M + 0 0xe0000000 0 0x40000000>; + }; + + soc { + #address-cells = <2>; + #size-cells = <1>; + compatible = "simple-bus"; + ranges = <0 0 0 0x18000000 0x04000000 // PCIe CFG + 1 0 0 0x16000000 0x01000000>; // GBU chipselects + + serial0: serial@30000 { + device_type = "serial"; + compatible = "ns16550"; + reg = <0 0x30100 0xa00>; + reg-shift = <2>; + reg-io-width = <4>; + clock-frequency = <133333333>; + interrupt-parent = <&pic>; + interrupts = <17>; + }; + serial1: serial@31000 { + device_type = "serial"; + compatible = "ns16550"; + reg = <0 0x31100 0xa00>; + reg-shift = <2>; + reg-io-width = <4>; + clock-frequency = <133333333>; + interrupt-parent = <&pic>; + interrupts = <18>; + }; + i2c0: ocores@32000 { + compatible = "opencores,i2c-ocores"; + #address-cells = <1>; + #size-cells = <0>; + reg = <0 0x32100 0xa00>; + reg-shift = <2>; + reg-io-width = <4>; + clock-frequency = <32000000>; + interrupt-parent = <&pic>; + interrupts = <30>; + }; + i2c1: ocores@33000 { + compatible = "opencores,i2c-ocores"; + #address-cells = <1>; + #size-cells = <0>; + reg = <0 0x33100 0xa00>; + reg-shift = <2>; + reg-io-width = <4>; + clock-frequency = <32000000>; + interrupt-parent = <&pic>; + interrupts = <31>; + + rtc@68 { + compatible = "dallas,ds1374"; + reg = <0x68>; + }; + + dtt@4c { + compatible = "national,lm90"; + reg = <0x4c>; + }; + }; + pic: pic@4000 { + interrupt-controller; + #address-cells = <0>; + #interrupt-cells = <1>; + reg = <0 0x4000 0x200>; + }; + + nor_flash@1,0 { + compatible = "cfi-flash"; + #address-cells = <1>; + #size-cells = <1>; + bank-width = <2>; + reg = <1 0 0x1000000>; + + partition@0 { + label = "x-loader"; + reg = <0x0 0x100000>; /* 1M */ + read-only; + }; + + partition@100000 { + label = "u-boot"; + reg = <0x100000 0x100000>; /* 1M */ + }; + + partition@200000 { + label = "kernel"; + reg = <0x200000 0x500000>; /* 5M */ + }; + + partition@700000 { + label = "rootfs"; + reg = <0x700000 0x800000>; /* 8M */ + }; + + partition@f00000 { + label = "env"; + reg = <0xf00000 0x100000>; /* 1M */ + read-only; + }; + }; + }; + + chosen { + bootargs = "console=ttyS0,115200 rdinit=/sbin/init"; + }; +}; diff --git a/arch/mips/netlogic/xlp/nlm_hal.c b/arch/mips/netlogic/xlp/nlm_hal.c index c68fd4026104..87560e4db35f 100644 --- a/arch/mips/netlogic/xlp/nlm_hal.c +++ b/arch/mips/netlogic/xlp/nlm_hal.c @@ -61,43 +61,61 @@ void nlm_node_init(int node) int nlm_irq_to_irt(int irq) { - if (!PIC_IRQ_IS_IRT(irq)) - return -1; + uint64_t pcibase; + int devoff, irt; switch (irq) { case PIC_UART_0_IRQ: - return PIC_IRT_UART_0_INDEX; + devoff = XLP_IO_UART0_OFFSET(0); + break; case PIC_UART_1_IRQ: - return PIC_IRT_UART_1_INDEX; - case PIC_PCIE_LINK_0_IRQ: - return PIC_IRT_PCIE_LINK_0_INDEX; - case PIC_PCIE_LINK_1_IRQ: - return PIC_IRT_PCIE_LINK_1_INDEX; - case PIC_PCIE_LINK_2_IRQ: - return PIC_IRT_PCIE_LINK_2_INDEX; - case PIC_PCIE_LINK_3_IRQ: - return PIC_IRT_PCIE_LINK_3_INDEX; + devoff = XLP_IO_UART1_OFFSET(0); + break; case PIC_EHCI_0_IRQ: - return PIC_IRT_EHCI_0_INDEX; + devoff = XLP_IO_USB_EHCI0_OFFSET(0); + break; case PIC_EHCI_1_IRQ: - return PIC_IRT_EHCI_1_INDEX; + devoff = XLP_IO_USB_EHCI1_OFFSET(0); + break; case PIC_OHCI_0_IRQ: - return PIC_IRT_OHCI_0_INDEX; + devoff = XLP_IO_USB_OHCI0_OFFSET(0); + break; case PIC_OHCI_1_IRQ: - return PIC_IRT_OHCI_1_INDEX; + devoff = XLP_IO_USB_OHCI1_OFFSET(0); + break; case PIC_OHCI_2_IRQ: - return PIC_IRT_OHCI_2_INDEX; + devoff = XLP_IO_USB_OHCI2_OFFSET(0); + break; case PIC_OHCI_3_IRQ: - return PIC_IRT_OHCI_3_INDEX; + devoff = XLP_IO_USB_OHCI3_OFFSET(0); + break; case PIC_MMC_IRQ: - return PIC_IRT_MMC_INDEX; + devoff = XLP_IO_SD_OFFSET(0); + break; case PIC_I2C_0_IRQ: - return PIC_IRT_I2C_0_INDEX; + devoff = XLP_IO_I2C0_OFFSET(0); + break; case PIC_I2C_1_IRQ: - return PIC_IRT_I2C_1_INDEX; + devoff = XLP_IO_I2C1_OFFSET(0); + break; default: - return -1; + devoff = 0; + break; } + + if (devoff != 0) { + pcibase = nlm_pcicfg_base(devoff); + irt = nlm_read_reg(pcibase, XLP_PCI_IRTINFO_REG) & 0xffff; + /* HW bug, I2C 1 irt entry is off by one */ + if (irq == PIC_I2C_1_IRQ) + irt = irt + 1; + } else if (irq >= PIC_PCIE_LINK_0_IRQ && irq <= PIC_PCIE_LINK_3_IRQ) { + /* HW bug, PCI IRT entries are bad on early silicon, fix */ + irt = PIC_IRT_PCIE_LINK_INDEX(irq - PIC_PCIE_LINK_0_IRQ); + } else { + irt = -1; + } + return irt; } unsigned int nlm_get_core_frequency(int node, int core) diff --git a/arch/mips/netlogic/xlp/setup.c b/arch/mips/netlogic/xlp/setup.c index 4894d62043ac..af319143b591 100644 --- a/arch/mips/netlogic/xlp/setup.c +++ b/arch/mips/netlogic/xlp/setup.c @@ -56,7 +56,7 @@ uint64_t nlm_io_base; struct nlm_soc_info nlm_nodes[NLM_NR_NODES]; cpumask_t nlm_cpumask = CPU_MASK_CPU0; unsigned int nlm_threads_per_core; -extern u32 __dtb_start[]; +extern u32 __dtb_xlp_evp_begin[], __dtb_xlp_svp_begin[], __dtb_start[]; static void nlm_linux_exit(void) { @@ -82,8 +82,24 @@ void __init plat_mem_setup(void) * 64-bit, so convert pointer. */ fdtp = (void *)(long)fw_arg0; - if (!fdtp) - fdtp = __dtb_start; + if (!fdtp) { + switch (current_cpu_data.processor_id & 0xff00) { +#ifdef CONFIG_DT_XLP_SVP + case PRID_IMP_NETLOGIC_XLP3XX: + fdtp = __dtb_xlp_svp_begin; + break; +#endif +#ifdef CONFIG_DT_XLP_EVP + case PRID_IMP_NETLOGIC_XLP8XX: + fdtp = __dtb_xlp_evp_begin; + break; +#endif + default: + /* Pick a built-in if any, and hope for the best */ + fdtp = __dtb_start; + break; + } + } fdtp = phys_to_virt(__pa(fdtp)); early_init_devtree(fdtp); } diff --git a/arch/mips/netlogic/xlp/usb-init.c b/arch/mips/netlogic/xlp/usb-init.c index 1d0b66c62fd1..9c401dd78337 100644 --- a/arch/mips/netlogic/xlp/usb-init.c +++ b/arch/mips/netlogic/xlp/usb-init.c @@ -42,7 +42,30 @@ #include #include #include -#include + +/* + * USB glue logic registers, used only during initialization + */ +#define USB_CTL_0 0x01 +#define USB_PHY_0 0x0A +#define USB_PHY_RESET 0x01 +#define USB_PHY_PORT_RESET_0 0x10 +#define USB_PHY_PORT_RESET_1 0x20 +#define USB_CONTROLLER_RESET 0x01 +#define USB_INT_STATUS 0x0E +#define USB_INT_EN 0x0F +#define USB_PHY_INTERRUPT_EN 0x01 +#define USB_OHCI_INTERRUPT_EN 0x02 +#define USB_OHCI_INTERRUPT1_EN 0x04 +#define USB_OHCI_INTERRUPT2_EN 0x08 +#define USB_CTRL_INTERRUPT_EN 0x10 + +#define nlm_read_usb_reg(b, r) nlm_read_reg(b, r) +#define nlm_write_usb_reg(b, r, v) nlm_write_reg(b, r, v) +#define nlm_get_usb_pcibase(node, inst) \ + nlm_pcicfg_base(XLP_IO_USB_OFFSET(node, inst)) +#define nlm_get_usb_regbase(node, inst) \ + (nlm_get_usb_pcibase(node, inst) + XLP_IO_PCI_HDRSZ) static void nlm_usb_intr_en(int node, int port) { @@ -99,23 +122,23 @@ static void nlm_usb_fixup_final(struct pci_dev *dev) dev->dev.coherent_dma_mask = DMA_BIT_MASK(64); switch (dev->devfn) { case 0x10: - dev->irq = PIC_EHCI_0_IRQ; - break; + dev->irq = PIC_EHCI_0_IRQ; + break; case 0x11: - dev->irq = PIC_OHCI_0_IRQ; - break; + dev->irq = PIC_OHCI_0_IRQ; + break; case 0x12: - dev->irq = PIC_OHCI_1_IRQ; - break; + dev->irq = PIC_OHCI_1_IRQ; + break; case 0x13: - dev->irq = PIC_EHCI_1_IRQ; - break; + dev->irq = PIC_EHCI_1_IRQ; + break; case 0x14: - dev->irq = PIC_OHCI_2_IRQ; - break; + dev->irq = PIC_OHCI_2_IRQ; + break; case 0x15: - dev->irq = PIC_OHCI_3_IRQ; - break; + dev->irq = PIC_OHCI_3_IRQ; + break; } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_NETLOGIC, PCI_DEVICE_ID_NLM_EHCI, diff --git a/arch/mips/oprofile/op_model_mipsxx.c b/arch/mips/oprofile/op_model_mipsxx.c index 1fd361462c03..e4b1140cdae0 100644 --- a/arch/mips/oprofile/op_model_mipsxx.c +++ b/arch/mips/oprofile/op_model_mipsxx.c @@ -41,7 +41,7 @@ static int (*save_perf_irq)(void); * first hardware thread in the core for setup and init. * Skip CPUs with non-zero hardware thread id (4 hwt per core) */ -#ifdef CONFIG_CPU_XLR +#if defined(CONFIG_CPU_XLR) && defined(CONFIG_SMP) #define oprofile_skip_cpu(c) ((cpu_logical_map(c) & 0x3) != 0) #else #define oprofile_skip_cpu(c) 0 diff --git a/arch/mips/pci/pci-ar71xx.c b/arch/mips/pci/pci-ar71xx.c index 412ec025cf55..18517dd0f709 100644 --- a/arch/mips/pci/pci-ar71xx.c +++ b/arch/mips/pci/pci-ar71xx.c @@ -366,9 +366,9 @@ static int ar71xx_pci_probe(struct platform_device *pdev) if (!res) return -EINVAL; - apc->cfg_base = devm_request_and_ioremap(&pdev->dev, res); - if (!apc->cfg_base) - return -ENOMEM; + apc->cfg_base = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(apc->cfg_base)) + return PTR_ERR(apc->cfg_base); apc->irq = platform_get_irq(pdev, 0); if (apc->irq < 0) diff --git a/arch/mips/pci/pci-ar724x.c b/arch/mips/pci/pci-ar724x.c index 8a0700d448fe..65ec032fa0b4 100644 --- a/arch/mips/pci/pci-ar724x.c +++ b/arch/mips/pci/pci-ar724x.c @@ -365,25 +365,25 @@ static int ar724x_pci_probe(struct platform_device *pdev) if (!res) return -EINVAL; - apc->ctrl_base = devm_request_and_ioremap(&pdev->dev, res); - if (apc->ctrl_base == NULL) - return -EBUSY; + apc->ctrl_base = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(apc->ctrl_base)) + return PTR_ERR(apc->ctrl_base); res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "cfg_base"); if (!res) return -EINVAL; - apc->devcfg_base = devm_request_and_ioremap(&pdev->dev, res); - if (!apc->devcfg_base) - return -EBUSY; + apc->devcfg_base = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(apc->devcfg_base)) + return PTR_ERR(apc->devcfg_base); res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "crp_base"); if (!res) return -EINVAL; - apc->crp_base = devm_request_and_ioremap(&pdev->dev, res); - if (apc->crp_base == NULL) - return -EBUSY; + apc->crp_base = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(apc->crp_base)) + return PTR_ERR(apc->crp_base); apc->irq = platform_get_irq(pdev, 0); if (apc->irq < 0) diff --git a/arch/mips/pci/pci-bcm63xx.c b/arch/mips/pci/pci-bcm63xx.c index 88e781c6b5ba..2eb954239bc5 100644 --- a/arch/mips/pci/pci-bcm63xx.c +++ b/arch/mips/pci/pci-bcm63xx.c @@ -121,11 +121,17 @@ void __iomem *pci_iospace_start; static void __init bcm63xx_reset_pcie(void) { u32 val; + u32 reg; /* enable SERDES */ - val = bcm_misc_readl(MISC_SERDES_CTRL_REG); + if (BCMCPU_IS_6328()) + reg = MISC_SERDES_CTRL_6328_REG; + else + reg = MISC_SERDES_CTRL_6362_REG; + + val = bcm_misc_readl(reg); val |= SERDES_PCIE_EN | SERDES_PCIE_EXD_EN; - bcm_misc_writel(val, MISC_SERDES_CTRL_REG); + bcm_misc_writel(val, reg); /* reset the PCIe core */ bcm63xx_core_set_reset(BCM63XX_RESET_PCIE, 1); @@ -330,6 +336,7 @@ static int __init bcm63xx_pci_init(void) switch (bcm63xx_get_cpu_id()) { case BCM6328_CPU_ID: + case BCM6362_CPU_ID: return bcm63xx_register_pcie(); case BCM6348_CPU_ID: case BCM6358_CPU_ID: diff --git a/arch/mips/powertv/init.c b/arch/mips/powertv/init.c index 5bd9d8f468cc..a01baff52cae 100644 --- a/arch/mips/powertv/init.c +++ b/arch/mips/powertv/init.c @@ -29,10 +29,11 @@ #include #include -#include #include #include +#include "init.h" + static int *_prom_envp; unsigned long _prom_memsize; diff --git a/arch/mips/powertv/init.h b/arch/mips/powertv/init.h index b194c34ca966..c1a8bd0dbe4b 100644 --- a/arch/mips/powertv/init.h +++ b/arch/mips/powertv/init.h @@ -23,4 +23,6 @@ #ifndef _POWERTV_INIT_H #define _POWERTV_INIT_H extern unsigned long _prom_memsize; +extern void prom_meminit(void); +extern char *prom_getenv(char *name); #endif diff --git a/arch/mips/powertv/memory.c b/arch/mips/powertv/memory.c index 6e5f1bdc59b5..bc2f3ca22b41 100644 --- a/arch/mips/powertv/memory.c +++ b/arch/mips/powertv/memory.c @@ -29,7 +29,6 @@ #include #include -#include #include #include diff --git a/arch/mips/powertv/powertv_setup.c b/arch/mips/powertv/powertv_setup.c index 820b8480f222..24689bff1039 100644 --- a/arch/mips/powertv/powertv_setup.c +++ b/arch/mips/powertv/powertv_setup.c @@ -31,7 +31,6 @@ #include #include #include -#include #include #include #include diff --git a/arch/mips/ralink/Kconfig b/arch/mips/ralink/Kconfig index a0b0197cab0a..026e823d871d 100644 --- a/arch/mips/ralink/Kconfig +++ b/arch/mips/ralink/Kconfig @@ -6,12 +6,23 @@ choice help Select Ralink MIPS SoC type. + config SOC_RT288X + bool "RT288x" + config SOC_RT305X bool "RT305x" select USB_ARCH_HAS_HCD select USB_ARCH_HAS_OHCI select USB_ARCH_HAS_EHCI + config SOC_RT3883 + bool "RT3883" + select USB_ARCH_HAS_OHCI + select USB_ARCH_HAS_EHCI + + config SOC_MT7620 + bool "MT7620" + endchoice choice @@ -23,10 +34,22 @@ choice config DTB_RT_NONE bool "None" + config DTB_RT2880_EVAL + bool "RT2880 eval kit" + depends on SOC_RT288X + config DTB_RT305X_EVAL bool "RT305x eval kit" depends on SOC_RT305X + config DTB_RT3883_EVAL + bool "RT3883 eval kit" + depends on SOC_RT3883 + + config DTB_MT7620A_EVAL + bool "MT7620A eval kit" + depends on SOC_MT7620 + endchoice endif diff --git a/arch/mips/ralink/Makefile b/arch/mips/ralink/Makefile index 939757f0e71f..38cf1a880aaa 100644 --- a/arch/mips/ralink/Makefile +++ b/arch/mips/ralink/Makefile @@ -8,7 +8,10 @@ obj-y := prom.o of.o reset.o clk.o irq.o +obj-$(CONFIG_SOC_RT288X) += rt288x.o obj-$(CONFIG_SOC_RT305X) += rt305x.o +obj-$(CONFIG_SOC_RT3883) += rt3883.o +obj-$(CONFIG_SOC_MT7620) += mt7620.o obj-$(CONFIG_EARLY_PRINTK) += early_printk.o diff --git a/arch/mips/ralink/Platform b/arch/mips/ralink/Platform index 6babd65765e6..cda4b6645c50 100644 --- a/arch/mips/ralink/Platform +++ b/arch/mips/ralink/Platform @@ -4,7 +4,25 @@ core-$(CONFIG_RALINK) += arch/mips/ralink/ cflags-$(CONFIG_RALINK) += -I$(srctree)/arch/mips/include/asm/mach-ralink +# +# Ralink RT288x +# +load-$(CONFIG_SOC_RT288X) += 0xffffffff88000000 +cflags-$(CONFIG_SOC_RT288X) += -I$(srctree)/arch/mips/include/asm/mach-ralink/rt288x + # # Ralink RT305x # load-$(CONFIG_SOC_RT305X) += 0xffffffff80000000 +cflags-$(CONFIG_SOC_RT305X) += -I$(srctree)/arch/mips/include/asm/mach-ralink/rt305x + +# +# Ralink RT3883 +# +load-$(CONFIG_SOC_RT3883) += 0xffffffff80000000 +cflags-$(CONFIG_SOC_RT3883) += -I$(srctree)/arch/mips/include/asm/mach-ralink/rt3883 + +# +# Ralink MT7620 +# +load-$(CONFIG_SOC_MT7620) += 0xffffffff80000000 diff --git a/arch/mips/ralink/common.h b/arch/mips/ralink/common.h index 300990313e1b..83144c3fc5ac 100644 --- a/arch/mips/ralink/common.h +++ b/arch/mips/ralink/common.h @@ -22,13 +22,22 @@ struct ralink_pinmux { struct ralink_pinmux_grp *mode; struct ralink_pinmux_grp *uart; int uart_shift; + u32 uart_mask; void (*wdt_reset)(void); + struct ralink_pinmux_grp *pci; + int pci_shift; + u32 pci_mask; }; -extern struct ralink_pinmux gpio_pinmux; +extern struct ralink_pinmux rt_gpio_pinmux; struct ralink_soc_info { unsigned char sys_type[RAMIPS_SYS_TYPE_LEN]; unsigned char *compatible; + + unsigned long mem_base; + unsigned long mem_size; + unsigned long mem_size_min; + unsigned long mem_size_max; }; extern struct ralink_soc_info soc_info; diff --git a/arch/mips/ralink/dts/Makefile b/arch/mips/ralink/dts/Makefile index 1a69fb300955..18194fa93e80 100644 --- a/arch/mips/ralink/dts/Makefile +++ b/arch/mips/ralink/dts/Makefile @@ -1 +1,4 @@ +obj-$(CONFIG_DTB_RT2880_EVAL) := rt2880_eval.dtb.o obj-$(CONFIG_DTB_RT305X_EVAL) := rt3052_eval.dtb.o +obj-$(CONFIG_DTB_RT3883_EVAL) := rt3883_eval.dtb.o +obj-$(CONFIG_DTB_MT7620A_EVAL) := mt7620a_eval.dtb.o diff --git a/arch/mips/ralink/dts/mt7620a.dtsi b/arch/mips/ralink/dts/mt7620a.dtsi new file mode 100644 index 000000000000..08bf24fefe9f --- /dev/null +++ b/arch/mips/ralink/dts/mt7620a.dtsi @@ -0,0 +1,58 @@ +/ { + #address-cells = <1>; + #size-cells = <1>; + compatible = "ralink,mtk7620a-soc"; + + cpus { + cpu@0 { + compatible = "mips,mips24KEc"; + }; + }; + + cpuintc: cpuintc@0 { + #address-cells = <0>; + #interrupt-cells = <1>; + interrupt-controller; + compatible = "mti,cpu-interrupt-controller"; + }; + + palmbus@10000000 { + compatible = "palmbus"; + reg = <0x10000000 0x200000>; + ranges = <0x0 0x10000000 0x1FFFFF>; + + #address-cells = <1>; + #size-cells = <1>; + + sysc@0 { + compatible = "ralink,mt7620a-sysc"; + reg = <0x0 0x100>; + }; + + intc: intc@200 { + compatible = "ralink,mt7620a-intc", "ralink,rt2880-intc"; + reg = <0x200 0x100>; + + interrupt-controller; + #interrupt-cells = <1>; + + interrupt-parent = <&cpuintc>; + interrupts = <2>; + }; + + memc@300 { + compatible = "ralink,mt7620a-memc", "ralink,rt3050-memc"; + reg = <0x300 0x100>; + }; + + uartlite@c00 { + compatible = "ralink,mt7620a-uart", "ralink,rt2880-uart", "ns16550a"; + reg = <0xc00 0x100>; + + interrupt-parent = <&intc>; + interrupts = <12>; + + reg-shift = <2>; + }; + }; +}; diff --git a/arch/mips/ralink/dts/mt7620a_eval.dts b/arch/mips/ralink/dts/mt7620a_eval.dts new file mode 100644 index 000000000000..35eb874ab7f1 --- /dev/null +++ b/arch/mips/ralink/dts/mt7620a_eval.dts @@ -0,0 +1,16 @@ +/dts-v1/; + +/include/ "mt7620a.dtsi" + +/ { + compatible = "ralink,mt7620a-eval-board", "ralink,mt7620a-soc"; + model = "Ralink MT7620A evaluation board"; + + memory@0 { + reg = <0x0 0x2000000>; + }; + + chosen { + bootargs = "console=ttyS0,57600"; + }; +}; diff --git a/arch/mips/ralink/dts/rt2880.dtsi b/arch/mips/ralink/dts/rt2880.dtsi new file mode 100644 index 000000000000..182afde2f2e1 --- /dev/null +++ b/arch/mips/ralink/dts/rt2880.dtsi @@ -0,0 +1,58 @@ +/ { + #address-cells = <1>; + #size-cells = <1>; + compatible = "ralink,rt2880-soc"; + + cpus { + cpu@0 { + compatible = "mips,mips4KEc"; + }; + }; + + cpuintc: cpuintc@0 { + #address-cells = <0>; + #interrupt-cells = <1>; + interrupt-controller; + compatible = "mti,cpu-interrupt-controller"; + }; + + palmbus@300000 { + compatible = "palmbus"; + reg = <0x300000 0x200000>; + ranges = <0x0 0x300000 0x1FFFFF>; + + #address-cells = <1>; + #size-cells = <1>; + + sysc@0 { + compatible = "ralink,rt2880-sysc"; + reg = <0x0 0x100>; + }; + + intc: intc@200 { + compatible = "ralink,rt2880-intc"; + reg = <0x200 0x100>; + + interrupt-controller; + #interrupt-cells = <1>; + + interrupt-parent = <&cpuintc>; + interrupts = <2>; + }; + + memc@300 { + compatible = "ralink,rt2880-memc"; + reg = <0x300 0x100>; + }; + + uartlite@c00 { + compatible = "ralink,rt2880-uart", "ns16550a"; + reg = <0xc00 0x100>; + + interrupt-parent = <&intc>; + interrupts = <8>; + + reg-shift = <2>; + }; + }; +}; diff --git a/arch/mips/ralink/dts/rt2880_eval.dts b/arch/mips/ralink/dts/rt2880_eval.dts new file mode 100644 index 000000000000..322d7002595b --- /dev/null +++ b/arch/mips/ralink/dts/rt2880_eval.dts @@ -0,0 +1,46 @@ +/dts-v1/; + +/include/ "rt2880.dtsi" + +/ { + compatible = "ralink,rt2880-eval-board", "ralink,rt2880-soc"; + model = "Ralink RT2880 evaluation board"; + + memory@0 { + reg = <0x8000000 0x2000000>; + }; + + chosen { + bootargs = "console=ttyS0,57600"; + }; + + cfi@1f000000 { + compatible = "cfi-flash"; + reg = <0x1f000000 0x400000>; + + bank-width = <2>; + device-width = <2>; + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "uboot"; + reg = <0x0 0x30000>; + read-only; + }; + partition@30000 { + label = "uboot-env"; + reg = <0x30000 0x10000>; + read-only; + }; + partition@40000 { + label = "calibration"; + reg = <0x40000 0x10000>; + read-only; + }; + partition@50000 { + label = "linux"; + reg = <0x50000 0x3b0000>; + }; + }; +}; diff --git a/arch/mips/ralink/dts/rt3050.dtsi b/arch/mips/ralink/dts/rt3050.dtsi index 069d0660e1dd..ef7da1e227e6 100644 --- a/arch/mips/ralink/dts/rt3050.dtsi +++ b/arch/mips/ralink/dts/rt3050.dtsi @@ -1,7 +1,7 @@ / { #address-cells = <1>; #size-cells = <1>; - compatible = "ralink,rt3050-soc", "ralink,rt3052-soc"; + compatible = "ralink,rt3050-soc", "ralink,rt3052-soc", "ralink,rt3350-soc"; cpus { cpu@0 { @@ -9,10 +9,6 @@ }; }; - chosen { - bootargs = "console=ttyS0,57600 init=/init"; - }; - cpuintc: cpuintc@0 { #address-cells = <0>; #interrupt-cells = <1>; @@ -23,7 +19,7 @@ palmbus@10000000 { compatible = "palmbus"; reg = <0x10000000 0x200000>; - ranges = <0x0 0x10000000 0x1FFFFF>; + ranges = <0x0 0x10000000 0x1FFFFF>; #address-cells = <1>; #size-cells = <1>; @@ -33,11 +29,6 @@ reg = <0x0 0x100>; }; - timer@100 { - compatible = "ralink,rt3052-wdt", "ralink,rt2880-wdt"; - reg = <0x100 0x100>; - }; - intc: intc@200 { compatible = "ralink,rt3052-intc", "ralink,rt2880-intc"; reg = <0x200 0x100>; @@ -54,45 +45,6 @@ reg = <0x300 0x100>; }; - gpio0: gpio@600 { - compatible = "ralink,rt3052-gpio", "ralink,rt2880-gpio"; - reg = <0x600 0x34>; - - gpio-controller; - #gpio-cells = <2>; - - ralink,ngpio = <24>; - ralink,regs = [ 00 04 08 0c - 20 24 28 2c - 30 34 ]; - }; - - gpio1: gpio@638 { - compatible = "ralink,rt3052-gpio", "ralink,rt2880-gpio"; - reg = <0x638 0x24>; - - gpio-controller; - #gpio-cells = <2>; - - ralink,ngpio = <16>; - ralink,regs = [ 00 04 08 0c - 10 14 18 1c - 20 24 ]; - }; - - gpio2: gpio@660 { - compatible = "ralink,rt3052-gpio", "ralink,rt2880-gpio"; - reg = <0x660 0x24>; - - gpio-controller; - #gpio-cells = <2>; - - ralink,ngpio = <12>; - ralink,regs = [ 00 04 08 0c - 10 14 18 1c - 20 24 ]; - }; - uartlite@c00 { compatible = "ralink,rt3052-uart", "ralink,rt2880-uart", "ns16550a"; reg = <0xc00 0x100>; diff --git a/arch/mips/ralink/dts/rt3052_eval.dts b/arch/mips/ralink/dts/rt3052_eval.dts index 148a590bc419..c18c9a84f4c4 100644 --- a/arch/mips/ralink/dts/rt3052_eval.dts +++ b/arch/mips/ralink/dts/rt3052_eval.dts @@ -1,10 +1,8 @@ /dts-v1/; -/include/ "rt3050.dtsi" +#include "rt3050.dtsi" / { - #address-cells = <1>; - #size-cells = <1>; compatible = "ralink,rt3052-eval-board", "ralink,rt3052-soc"; model = "Ralink RT3052 evaluation board"; @@ -12,12 +10,8 @@ reg = <0x0 0x2000000>; }; - palmbus@10000000 { - sysc@0 { - ralink,pinmmux = "uartlite", "spi"; - ralink,uartmux = "gpio"; - ralink,wdtmux = <0>; - }; + chosen { + bootargs = "console=ttyS0,57600"; }; cfi@1f000000 { diff --git a/arch/mips/ralink/dts/rt3883.dtsi b/arch/mips/ralink/dts/rt3883.dtsi new file mode 100644 index 000000000000..3b131dd0d5ac --- /dev/null +++ b/arch/mips/ralink/dts/rt3883.dtsi @@ -0,0 +1,58 @@ +/ { + #address-cells = <1>; + #size-cells = <1>; + compatible = "ralink,rt3883-soc"; + + cpus { + cpu@0 { + compatible = "mips,mips74Kc"; + }; + }; + + cpuintc: cpuintc@0 { + #address-cells = <0>; + #interrupt-cells = <1>; + interrupt-controller; + compatible = "mti,cpu-interrupt-controller"; + }; + + palmbus@10000000 { + compatible = "palmbus"; + reg = <0x10000000 0x200000>; + ranges = <0x0 0x10000000 0x1FFFFF>; + + #address-cells = <1>; + #size-cells = <1>; + + sysc@0 { + compatible = "ralink,rt3883-sysc", "ralink,rt3050-sysc"; + reg = <0x0 0x100>; + }; + + intc: intc@200 { + compatible = "ralink,rt3883-intc", "ralink,rt2880-intc"; + reg = <0x200 0x100>; + + interrupt-controller; + #interrupt-cells = <1>; + + interrupt-parent = <&cpuintc>; + interrupts = <2>; + }; + + memc@300 { + compatible = "ralink,rt3883-memc", "ralink,rt3050-memc"; + reg = <0x300 0x100>; + }; + + uartlite@c00 { + compatible = "ralink,rt3883-uart", "ralink,rt2880-uart", "ns16550a"; + reg = <0xc00 0x100>; + + interrupt-parent = <&intc>; + interrupts = <12>; + + reg-shift = <2>; + }; + }; +}; diff --git a/arch/mips/ralink/dts/rt3883_eval.dts b/arch/mips/ralink/dts/rt3883_eval.dts new file mode 100644 index 000000000000..2fa6b330bf4f --- /dev/null +++ b/arch/mips/ralink/dts/rt3883_eval.dts @@ -0,0 +1,16 @@ +/dts-v1/; + +/include/ "rt3883.dtsi" + +/ { + compatible = "ralink,rt3883-eval-board", "ralink,rt3883-soc"; + model = "Ralink RT3883 evaluation board"; + + memory@0 { + reg = <0x0 0x2000000>; + }; + + chosen { + bootargs = "console=ttyS0,57600"; + }; +}; diff --git a/arch/mips/ralink/early_printk.c b/arch/mips/ralink/early_printk.c index c4ae47eb24ab..b46d0419d09b 100644 --- a/arch/mips/ralink/early_printk.c +++ b/arch/mips/ralink/early_printk.c @@ -11,7 +11,11 @@ #include +#ifdef CONFIG_SOC_RT288X +#define EARLY_UART_BASE 0x300c00 +#else #define EARLY_UART_BASE 0x10000c00 +#endif #define UART_REG_RX 0x00 #define UART_REG_TX 0x04 diff --git a/arch/mips/ralink/irq.c b/arch/mips/ralink/irq.c index 6d054c5ec9ab..320b1f1043ff 100644 --- a/arch/mips/ralink/irq.c +++ b/arch/mips/ralink/irq.c @@ -31,6 +31,7 @@ #define INTC_INT_GLOBAL BIT(31) #define RALINK_CPU_IRQ_INTC (MIPS_CPU_IRQ_BASE + 2) +#define RALINK_CPU_IRQ_PCI (MIPS_CPU_IRQ_BASE + 4) #define RALINK_CPU_IRQ_FE (MIPS_CPU_IRQ_BASE + 5) #define RALINK_CPU_IRQ_WIFI (MIPS_CPU_IRQ_BASE + 6) #define RALINK_CPU_IRQ_COUNTER (MIPS_CPU_IRQ_BASE + 7) @@ -104,6 +105,9 @@ asmlinkage void plat_irq_dispatch(void) else if (pending & STATUSF_IP6) do_IRQ(RALINK_CPU_IRQ_WIFI); + else if (pending & STATUSF_IP4) + do_IRQ(RALINK_CPU_IRQ_PCI); + else if (pending & STATUSF_IP2) do_IRQ(RALINK_CPU_IRQ_INTC); @@ -162,6 +166,7 @@ static int __init intc_of_init(struct device_node *node, irq_set_chained_handler(irq, ralink_intc_irq_handler); irq_set_handler_data(irq, domain); + /* tell the kernel which irq is used for performance monitoring */ cp0_perfcount_irq = irq_create_mapping(domain, 9); return 0; diff --git a/arch/mips/ralink/mt7620.c b/arch/mips/ralink/mt7620.c new file mode 100644 index 000000000000..0018b1a661f6 --- /dev/null +++ b/arch/mips/ralink/mt7620.c @@ -0,0 +1,234 @@ +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * Parts of this file are based on Ralink's 2.6.21 BSP + * + * Copyright (C) 2008-2011 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * Copyright (C) 2013 John Crispin + */ + +#include +#include +#include + +#include +#include +#include + +#include "common.h" + +/* does the board have sdram or ddram */ +static int dram_type; + +/* the pll dividers */ +static u32 mt7620_clk_divider[] = { 2, 3, 4, 8 }; + +static struct ralink_pinmux_grp mode_mux[] = { + { + .name = "i2c", + .mask = MT7620_GPIO_MODE_I2C, + .gpio_first = 1, + .gpio_last = 2, + }, { + .name = "spi", + .mask = MT7620_GPIO_MODE_SPI, + .gpio_first = 3, + .gpio_last = 6, + }, { + .name = "uartlite", + .mask = MT7620_GPIO_MODE_UART1, + .gpio_first = 15, + .gpio_last = 16, + }, { + .name = "wdt", + .mask = MT7620_GPIO_MODE_WDT, + .gpio_first = 17, + .gpio_last = 17, + }, { + .name = "mdio", + .mask = MT7620_GPIO_MODE_MDIO, + .gpio_first = 22, + .gpio_last = 23, + }, { + .name = "rgmii1", + .mask = MT7620_GPIO_MODE_RGMII1, + .gpio_first = 24, + .gpio_last = 35, + }, { + .name = "spi refclk", + .mask = MT7620_GPIO_MODE_SPI_REF_CLK, + .gpio_first = 37, + .gpio_last = 39, + }, { + .name = "jtag", + .mask = MT7620_GPIO_MODE_JTAG, + .gpio_first = 40, + .gpio_last = 44, + }, { + /* shared lines with jtag */ + .name = "ephy", + .mask = MT7620_GPIO_MODE_EPHY, + .gpio_first = 40, + .gpio_last = 44, + }, { + .name = "nand", + .mask = MT7620_GPIO_MODE_JTAG, + .gpio_first = 45, + .gpio_last = 59, + }, { + .name = "rgmii2", + .mask = MT7620_GPIO_MODE_RGMII2, + .gpio_first = 60, + .gpio_last = 71, + }, { + .name = "wled", + .mask = MT7620_GPIO_MODE_WLED, + .gpio_first = 72, + .gpio_last = 72, + }, {0} +}; + +static struct ralink_pinmux_grp uart_mux[] = { + { + .name = "uartf", + .mask = MT7620_GPIO_MODE_UARTF, + .gpio_first = 7, + .gpio_last = 14, + }, { + .name = "pcm uartf", + .mask = MT7620_GPIO_MODE_PCM_UARTF, + .gpio_first = 7, + .gpio_last = 14, + }, { + .name = "pcm i2s", + .mask = MT7620_GPIO_MODE_PCM_I2S, + .gpio_first = 7, + .gpio_last = 14, + }, { + .name = "i2s uartf", + .mask = MT7620_GPIO_MODE_I2S_UARTF, + .gpio_first = 7, + .gpio_last = 14, + }, { + .name = "pcm gpio", + .mask = MT7620_GPIO_MODE_PCM_GPIO, + .gpio_first = 11, + .gpio_last = 14, + }, { + .name = "gpio uartf", + .mask = MT7620_GPIO_MODE_GPIO_UARTF, + .gpio_first = 7, + .gpio_last = 10, + }, { + .name = "gpio i2s", + .mask = MT7620_GPIO_MODE_GPIO_I2S, + .gpio_first = 7, + .gpio_last = 10, + }, { + .name = "gpio", + .mask = MT7620_GPIO_MODE_GPIO, + }, {0} +}; + +struct ralink_pinmux rt_gpio_pinmux = { + .mode = mode_mux, + .uart = uart_mux, + .uart_shift = MT7620_GPIO_MODE_UART0_SHIFT, + .uart_mask = MT7620_GPIO_MODE_UART0_MASK, +}; + +void __init ralink_clk_init(void) +{ + unsigned long cpu_rate, sys_rate; + u32 c0 = rt_sysc_r32(SYSC_REG_CPLL_CONFIG0); + u32 c1 = rt_sysc_r32(SYSC_REG_CPLL_CONFIG1); + u32 swconfig = (c0 >> CPLL_SW_CONFIG_SHIFT) & CPLL_SW_CONFIG_MASK; + u32 cpu_clk = (c1 >> CPLL_CPU_CLK_SHIFT) & CPLL_CPU_CLK_MASK; + + if (cpu_clk) { + cpu_rate = 480000000; + } else if (!swconfig) { + cpu_rate = 600000000; + } else { + u32 m = (c0 >> CPLL_MULT_RATIO_SHIFT) & CPLL_MULT_RATIO; + u32 d = (c0 >> CPLL_DIV_RATIO_SHIFT) & CPLL_DIV_RATIO; + + cpu_rate = ((40 * (m + 24)) / mt7620_clk_divider[d]) * 1000000; + } + + if (dram_type == SYSCFG0_DRAM_TYPE_SDRAM) + sys_rate = cpu_rate / 4; + else + sys_rate = cpu_rate / 3; + + ralink_clk_add("cpu", cpu_rate); + ralink_clk_add("10000100.timer", 40000000); + ralink_clk_add("10000500.uart", 40000000); + ralink_clk_add("10000c00.uartlite", 40000000); +} + +void __init ralink_of_remap(void) +{ + rt_sysc_membase = plat_of_remap_node("ralink,mt7620a-sysc"); + rt_memc_membase = plat_of_remap_node("ralink,mt7620a-memc"); + + if (!rt_sysc_membase || !rt_memc_membase) + panic("Failed to remap core resources"); +} + +void prom_soc_init(struct ralink_soc_info *soc_info) +{ + void __iomem *sysc = (void __iomem *) KSEG1ADDR(MT7620_SYSC_BASE); + unsigned char *name = NULL; + u32 n0; + u32 n1; + u32 rev; + u32 cfg0; + + n0 = __raw_readl(sysc + SYSC_REG_CHIP_NAME0); + n1 = __raw_readl(sysc + SYSC_REG_CHIP_NAME1); + + if (n0 == MT7620N_CHIP_NAME0 && n1 == MT7620N_CHIP_NAME1) { + name = "MT7620N"; + soc_info->compatible = "ralink,mt7620n-soc"; + } else if (n0 == MT7620A_CHIP_NAME0 && n1 == MT7620A_CHIP_NAME1) { + name = "MT7620A"; + soc_info->compatible = "ralink,mt7620a-soc"; + } else { + panic("mt7620: unknown SoC, n0:%08x n1:%08x\n", n0, n1); + } + + rev = __raw_readl(sysc + SYSC_REG_CHIP_REV); + + snprintf(soc_info->sys_type, RAMIPS_SYS_TYPE_LEN, + "Ralink %s ver:%u eco:%u", + name, + (rev >> CHIP_REV_VER_SHIFT) & CHIP_REV_VER_MASK, + (rev & CHIP_REV_ECO_MASK)); + + cfg0 = __raw_readl(sysc + SYSC_REG_SYSTEM_CONFIG0); + dram_type = (cfg0 >> SYSCFG0_DRAM_TYPE_SHIFT) & SYSCFG0_DRAM_TYPE_MASK; + + switch (dram_type) { + case SYSCFG0_DRAM_TYPE_SDRAM: + soc_info->mem_size_min = MT7620_SDRAM_SIZE_MIN; + soc_info->mem_size_max = MT7620_SDRAM_SIZE_MAX; + break; + + case SYSCFG0_DRAM_TYPE_DDR1: + soc_info->mem_size_min = MT7620_DDR1_SIZE_MIN; + soc_info->mem_size_max = MT7620_DDR1_SIZE_MAX; + break; + + case SYSCFG0_DRAM_TYPE_DDR2: + soc_info->mem_size_min = MT7620_DDR2_SIZE_MIN; + soc_info->mem_size_max = MT7620_DDR2_SIZE_MAX; + break; + default: + BUG(); + } + soc_info->mem_base = MT7620_DRAM_BASE; +} diff --git a/arch/mips/ralink/of.c b/arch/mips/ralink/of.c index 4165e70775be..fb1569580def 100644 --- a/arch/mips/ralink/of.c +++ b/arch/mips/ralink/of.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -85,6 +86,14 @@ void __init plat_mem_setup(void) * parsed resulting in our memory appearing */ __dt_setup_arch(&__dtb_start); + + if (soc_info.mem_size) + add_memory_region(soc_info.mem_base, soc_info.mem_size, + BOOT_MEM_RAM); + else + detect_memory_region(soc_info.mem_base, + soc_info.mem_size_min * SZ_1M, + soc_info.mem_size_max * SZ_1M); } static int __init plat_of_setup(void) diff --git a/arch/mips/ralink/rt288x.c b/arch/mips/ralink/rt288x.c new file mode 100644 index 000000000000..f87de1ab2198 --- /dev/null +++ b/arch/mips/ralink/rt288x.c @@ -0,0 +1,143 @@ +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * Parts of this file are based on Ralink's 2.6.21 BSP + * + * Copyright (C) 2008-2011 Gabor Juhos + * Copyright (C) 2008 Imre Kaloz + * Copyright (C) 2013 John Crispin + */ + +#include +#include +#include + +#include +#include +#include + +#include "common.h" + +static struct ralink_pinmux_grp mode_mux[] = { + { + .name = "i2c", + .mask = RT2880_GPIO_MODE_I2C, + .gpio_first = 1, + .gpio_last = 2, + }, { + .name = "spi", + .mask = RT2880_GPIO_MODE_SPI, + .gpio_first = 3, + .gpio_last = 6, + }, { + .name = "uartlite", + .mask = RT2880_GPIO_MODE_UART0, + .gpio_first = 7, + .gpio_last = 14, + }, { + .name = "jtag", + .mask = RT2880_GPIO_MODE_JTAG, + .gpio_first = 17, + .gpio_last = 21, + }, { + .name = "mdio", + .mask = RT2880_GPIO_MODE_MDIO, + .gpio_first = 22, + .gpio_last = 23, + }, { + .name = "sdram", + .mask = RT2880_GPIO_MODE_SDRAM, + .gpio_first = 24, + .gpio_last = 39, + }, { + .name = "pci", + .mask = RT2880_GPIO_MODE_PCI, + .gpio_first = 40, + .gpio_last = 71, + }, {0} +}; + +static void rt288x_wdt_reset(void) +{ + u32 t; + + /* enable WDT reset output on pin SRAM_CS_N */ + t = rt_sysc_r32(SYSC_REG_CLKCFG); + t |= CLKCFG_SRAM_CS_N_WDT; + rt_sysc_w32(t, SYSC_REG_CLKCFG); +} + +struct ralink_pinmux rt_gpio_pinmux = { + .mode = mode_mux, + .wdt_reset = rt288x_wdt_reset, +}; + +void __init ralink_clk_init(void) +{ + unsigned long cpu_rate; + u32 t = rt_sysc_r32(SYSC_REG_SYSTEM_CONFIG); + t = ((t >> SYSTEM_CONFIG_CPUCLK_SHIFT) & SYSTEM_CONFIG_CPUCLK_MASK); + + switch (t) { + case SYSTEM_CONFIG_CPUCLK_250: + cpu_rate = 250000000; + break; + case SYSTEM_CONFIG_CPUCLK_266: + cpu_rate = 266666667; + break; + case SYSTEM_CONFIG_CPUCLK_280: + cpu_rate = 280000000; + break; + case SYSTEM_CONFIG_CPUCLK_300: + cpu_rate = 300000000; + break; + } + + ralink_clk_add("cpu", cpu_rate); + ralink_clk_add("300100.timer", cpu_rate / 2); + ralink_clk_add("300120.watchdog", cpu_rate / 2); + ralink_clk_add("300500.uart", cpu_rate / 2); + ralink_clk_add("300c00.uartlite", cpu_rate / 2); + ralink_clk_add("400000.ethernet", cpu_rate / 2); +} + +void __init ralink_of_remap(void) +{ + rt_sysc_membase = plat_of_remap_node("ralink,rt2880-sysc"); + rt_memc_membase = plat_of_remap_node("ralink,rt2880-memc"); + + if (!rt_sysc_membase || !rt_memc_membase) + panic("Failed to remap core resources"); +} + +void prom_soc_init(struct ralink_soc_info *soc_info) +{ + void __iomem *sysc = (void __iomem *) KSEG1ADDR(RT2880_SYSC_BASE); + const char *name; + u32 n0; + u32 n1; + u32 id; + + n0 = __raw_readl(sysc + SYSC_REG_CHIP_NAME0); + n1 = __raw_readl(sysc + SYSC_REG_CHIP_NAME1); + id = __raw_readl(sysc + SYSC_REG_CHIP_ID); + + if (n0 == RT2880_CHIP_NAME0 && n1 == RT2880_CHIP_NAME1) { + soc_info->compatible = "ralink,r2880-soc"; + name = "RT2880"; + } else { + panic("rt288x: unknown SoC, n0:%08x n1:%08x", n0, n1); + } + + snprintf(soc_info->sys_type, RAMIPS_SYS_TYPE_LEN, + "Ralink %s id:%u rev:%u", + name, + (id >> CHIP_ID_ID_SHIFT) & CHIP_ID_ID_MASK, + (id & CHIP_ID_REV_MASK)); + + soc_info->mem_base = RT2880_SDRAM_BASE; + soc_info->mem_size_min = RT2880_MEM_SIZE_MIN; + soc_info->mem_size_max = RT2880_MEM_SIZE_MAX; +} diff --git a/arch/mips/ralink/rt305x.c b/arch/mips/ralink/rt305x.c index 0a4bbdcf59d9..ca7ee3a33790 100644 --- a/arch/mips/ralink/rt305x.c +++ b/arch/mips/ralink/rt305x.c @@ -22,7 +22,7 @@ enum rt305x_soc_type rt305x_soc; -struct ralink_pinmux_grp mode_mux[] = { +static struct ralink_pinmux_grp mode_mux[] = { { .name = "i2c", .mask = RT305X_GPIO_MODE_I2C, @@ -61,7 +61,7 @@ struct ralink_pinmux_grp mode_mux[] = { }, {0} }; -struct ralink_pinmux_grp uart_mux[] = { +static struct ralink_pinmux_grp uart_mux[] = { { .name = "uartf", .mask = RT305X_GPIO_MODE_UARTF, @@ -91,19 +91,19 @@ struct ralink_pinmux_grp uart_mux[] = { .name = "gpio uartf", .mask = RT305X_GPIO_MODE_GPIO_UARTF, .gpio_first = RT305X_GPIO_7, - .gpio_last = RT305X_GPIO_14, + .gpio_last = RT305X_GPIO_10, }, { .name = "gpio i2s", .mask = RT305X_GPIO_MODE_GPIO_I2S, .gpio_first = RT305X_GPIO_7, - .gpio_last = RT305X_GPIO_14, + .gpio_last = RT305X_GPIO_10, }, { .name = "gpio", .mask = RT305X_GPIO_MODE_GPIO, }, {0} }; -void rt305x_wdt_reset(void) +static void rt305x_wdt_reset(void) { u32 t; @@ -114,16 +114,53 @@ void rt305x_wdt_reset(void) rt_sysc_w32(t, SYSC_REG_SYSTEM_CONFIG); } -struct ralink_pinmux gpio_pinmux = { +struct ralink_pinmux rt_gpio_pinmux = { .mode = mode_mux, .uart = uart_mux, .uart_shift = RT305X_GPIO_MODE_UART0_SHIFT, + .uart_mask = RT305X_GPIO_MODE_UART0_MASK, .wdt_reset = rt305x_wdt_reset, }; +static unsigned long rt5350_get_mem_size(void) +{ + void __iomem *sysc = (void __iomem *) KSEG1ADDR(RT305X_SYSC_BASE); + unsigned long ret; + u32 t; + + t = __raw_readl(sysc + SYSC_REG_SYSTEM_CONFIG); + t = (t >> RT5350_SYSCFG0_DRAM_SIZE_SHIFT) & + RT5350_SYSCFG0_DRAM_SIZE_MASK; + + switch (t) { + case RT5350_SYSCFG0_DRAM_SIZE_2M: + ret = 2; + break; + case RT5350_SYSCFG0_DRAM_SIZE_8M: + ret = 8; + break; + case RT5350_SYSCFG0_DRAM_SIZE_16M: + ret = 16; + break; + case RT5350_SYSCFG0_DRAM_SIZE_32M: + ret = 32; + break; + case RT5350_SYSCFG0_DRAM_SIZE_64M: + ret = 64; + break; + default: + panic("rt5350: invalid DRAM size: %u", t); + break; + } + + return ret; +} + void __init ralink_clk_init(void) { unsigned long cpu_rate, sys_rate, wdt_rate, uart_rate; + unsigned long wmac_rate = 40000000; + u32 t = rt_sysc_r32(SYSC_REG_SYSTEM_CONFIG); if (soc_is_rt305x() || soc_is_rt3350()) { @@ -176,11 +213,21 @@ void __init ralink_clk_init(void) BUG(); } + if (soc_is_rt3352() || soc_is_rt5350()) { + u32 val = rt_sysc_r32(RT3352_SYSC_REG_SYSCFG0); + + if (!(val & RT3352_CLKCFG0_XTAL_SEL)) + wmac_rate = 20000000; + } + ralink_clk_add("cpu", cpu_rate); ralink_clk_add("10000b00.spi", sys_rate); ralink_clk_add("10000100.timer", wdt_rate); + ralink_clk_add("10000120.watchdog", wdt_rate); ralink_clk_add("10000500.uart", uart_rate); ralink_clk_add("10000c00.uartlite", uart_rate); + ralink_clk_add("10100000.ethernet", sys_rate); + ralink_clk_add("10180000.wmac", wmac_rate); } void __init ralink_of_remap(void) @@ -239,4 +286,15 @@ void prom_soc_init(struct ralink_soc_info *soc_info) name, (id >> CHIP_ID_ID_SHIFT) & CHIP_ID_ID_MASK, (id & CHIP_ID_REV_MASK)); + + soc_info->mem_base = RT305X_SDRAM_BASE; + if (soc_is_rt5350()) { + soc_info->mem_size = rt5350_get_mem_size(); + } else if (soc_is_rt305x() || soc_is_rt3350()) { + soc_info->mem_size_min = RT305X_MEM_SIZE_MIN; + soc_info->mem_size_max = RT305X_MEM_SIZE_MAX; + } else if (soc_is_rt3352()) { + soc_info->mem_size_min = RT3352_MEM_SIZE_MIN; + soc_info->mem_size_max = RT3352_MEM_SIZE_MAX; + } } diff --git a/arch/mips/ralink/rt3883.c b/arch/mips/ralink/rt3883.c new file mode 100644 index 000000000000..b474ac284b83 --- /dev/null +++ b/arch/mips/ralink/rt3883.c @@ -0,0 +1,246 @@ +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * Parts of this file are based on Ralink's 2.6.21 BSP + * + * Copyright (C) 2008 Imre Kaloz + * Copyright (C) 2008-2011 Gabor Juhos + * Copyright (C) 2013 John Crispin + */ + +#include +#include +#include + +#include +#include +#include + +#include "common.h" + +static struct ralink_pinmux_grp mode_mux[] = { + { + .name = "i2c", + .mask = RT3883_GPIO_MODE_I2C, + .gpio_first = RT3883_GPIO_I2C_SD, + .gpio_last = RT3883_GPIO_I2C_SCLK, + }, { + .name = "spi", + .mask = RT3883_GPIO_MODE_SPI, + .gpio_first = RT3883_GPIO_SPI_CS0, + .gpio_last = RT3883_GPIO_SPI_MISO, + }, { + .name = "uartlite", + .mask = RT3883_GPIO_MODE_UART1, + .gpio_first = RT3883_GPIO_UART1_TXD, + .gpio_last = RT3883_GPIO_UART1_RXD, + }, { + .name = "jtag", + .mask = RT3883_GPIO_MODE_JTAG, + .gpio_first = RT3883_GPIO_JTAG_TDO, + .gpio_last = RT3883_GPIO_JTAG_TCLK, + }, { + .name = "mdio", + .mask = RT3883_GPIO_MODE_MDIO, + .gpio_first = RT3883_GPIO_MDIO_MDC, + .gpio_last = RT3883_GPIO_MDIO_MDIO, + }, { + .name = "ge1", + .mask = RT3883_GPIO_MODE_GE1, + .gpio_first = RT3883_GPIO_GE1_TXD0, + .gpio_last = RT3883_GPIO_GE1_RXCLK, + }, { + .name = "ge2", + .mask = RT3883_GPIO_MODE_GE2, + .gpio_first = RT3883_GPIO_GE2_TXD0, + .gpio_last = RT3883_GPIO_GE2_RXCLK, + }, { + .name = "pci", + .mask = RT3883_GPIO_MODE_PCI, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, { + .name = "lna a", + .mask = RT3883_GPIO_MODE_LNA_A, + .gpio_first = RT3883_GPIO_LNA_PE_A0, + .gpio_last = RT3883_GPIO_LNA_PE_A2, + }, { + .name = "lna g", + .mask = RT3883_GPIO_MODE_LNA_G, + .gpio_first = RT3883_GPIO_LNA_PE_G0, + .gpio_last = RT3883_GPIO_LNA_PE_G2, + }, {0} +}; + +static struct ralink_pinmux_grp uart_mux[] = { + { + .name = "uartf", + .mask = RT3883_GPIO_MODE_UARTF, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_14, + }, { + .name = "pcm uartf", + .mask = RT3883_GPIO_MODE_PCM_UARTF, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_14, + }, { + .name = "pcm i2s", + .mask = RT3883_GPIO_MODE_PCM_I2S, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_14, + }, { + .name = "i2s uartf", + .mask = RT3883_GPIO_MODE_I2S_UARTF, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_14, + }, { + .name = "pcm gpio", + .mask = RT3883_GPIO_MODE_PCM_GPIO, + .gpio_first = RT3883_GPIO_11, + .gpio_last = RT3883_GPIO_14, + }, { + .name = "gpio uartf", + .mask = RT3883_GPIO_MODE_GPIO_UARTF, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_10, + }, { + .name = "gpio i2s", + .mask = RT3883_GPIO_MODE_GPIO_I2S, + .gpio_first = RT3883_GPIO_7, + .gpio_last = RT3883_GPIO_10, + }, { + .name = "gpio", + .mask = RT3883_GPIO_MODE_GPIO, + }, {0} +}; + +static struct ralink_pinmux_grp pci_mux[] = { + { + .name = "pci-dev", + .mask = 0, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, { + .name = "pci-host2", + .mask = 1, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, { + .name = "pci-host1", + .mask = 2, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, { + .name = "pci-fnc", + .mask = 3, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, { + .name = "pci-gpio", + .mask = 7, + .gpio_first = RT3883_GPIO_PCI_AD0, + .gpio_last = RT3883_GPIO_PCI_AD31, + }, {0} +}; + +static void rt3883_wdt_reset(void) +{ + u32 t; + + /* enable WDT reset output on GPIO 2 */ + t = rt_sysc_r32(RT3883_SYSC_REG_SYSCFG1); + t |= RT3883_SYSCFG1_GPIO2_AS_WDT_OUT; + rt_sysc_w32(t, RT3883_SYSC_REG_SYSCFG1); +} + +struct ralink_pinmux rt_gpio_pinmux = { + .mode = mode_mux, + .uart = uart_mux, + .uart_shift = RT3883_GPIO_MODE_UART0_SHIFT, + .uart_mask = RT3883_GPIO_MODE_UART0_MASK, + .wdt_reset = rt3883_wdt_reset, + .pci = pci_mux, + .pci_shift = RT3883_GPIO_MODE_PCI_SHIFT, + .pci_mask = RT3883_GPIO_MODE_PCI_MASK, +}; + +void __init ralink_clk_init(void) +{ + unsigned long cpu_rate, sys_rate; + u32 syscfg0; + u32 clksel; + u32 ddr2; + + syscfg0 = rt_sysc_r32(RT3883_SYSC_REG_SYSCFG0); + clksel = ((syscfg0 >> RT3883_SYSCFG0_CPUCLK_SHIFT) & + RT3883_SYSCFG0_CPUCLK_MASK); + ddr2 = syscfg0 & RT3883_SYSCFG0_DRAM_TYPE_DDR2; + + switch (clksel) { + case RT3883_SYSCFG0_CPUCLK_250: + cpu_rate = 250000000; + sys_rate = (ddr2) ? 125000000 : 83000000; + break; + case RT3883_SYSCFG0_CPUCLK_384: + cpu_rate = 384000000; + sys_rate = (ddr2) ? 128000000 : 96000000; + break; + case RT3883_SYSCFG0_CPUCLK_480: + cpu_rate = 480000000; + sys_rate = (ddr2) ? 160000000 : 120000000; + break; + case RT3883_SYSCFG0_CPUCLK_500: + cpu_rate = 500000000; + sys_rate = (ddr2) ? 166000000 : 125000000; + break; + } + + ralink_clk_add("cpu", cpu_rate); + ralink_clk_add("10000100.timer", sys_rate); + ralink_clk_add("10000120.watchdog", sys_rate); + ralink_clk_add("10000500.uart", 40000000); + ralink_clk_add("10000b00.spi", sys_rate); + ralink_clk_add("10000c00.uartlite", 40000000); + ralink_clk_add("10100000.ethernet", sys_rate); +} + +void __init ralink_of_remap(void) +{ + rt_sysc_membase = plat_of_remap_node("ralink,rt3883-sysc"); + rt_memc_membase = plat_of_remap_node("ralink,rt3883-memc"); + + if (!rt_sysc_membase || !rt_memc_membase) + panic("Failed to remap core resources"); +} + +void prom_soc_init(struct ralink_soc_info *soc_info) +{ + void __iomem *sysc = (void __iomem *) KSEG1ADDR(RT3883_SYSC_BASE); + const char *name; + u32 n0; + u32 n1; + u32 id; + + n0 = __raw_readl(sysc + RT3883_SYSC_REG_CHIPID0_3); + n1 = __raw_readl(sysc + RT3883_SYSC_REG_CHIPID4_7); + id = __raw_readl(sysc + RT3883_SYSC_REG_REVID); + + if (n0 == RT3883_CHIP_NAME0 && n1 == RT3883_CHIP_NAME1) { + soc_info->compatible = "ralink,rt3883-soc"; + name = "RT3883"; + } else { + panic("rt3883: unknown SoC, n0:%08x n1:%08x", n0, n1); + } + + snprintf(soc_info->sys_type, RAMIPS_SYS_TYPE_LEN, + "Ralink %s ver:%u eco:%u", + name, + (id >> RT3883_REVID_VER_ID_SHIFT) & RT3883_REVID_VER_ID_MASK, + (id & RT3883_REVID_ECO_ID_MASK)); + + soc_info->mem_base = RT3883_SDRAM_BASE; + soc_info->mem_size_min = RT3883_MEM_SIZE_MIN; + soc_info->mem_size_max = RT3883_MEM_SIZE_MAX; +} diff --git a/arch/mips/sgi-ip27/ip27-klnuma.c b/arch/mips/sgi-ip27/ip27-klnuma.c index 1d1919a44e88..7a53b1e28a93 100644 --- a/arch/mips/sgi-ip27/ip27-klnuma.c +++ b/arch/mips/sgi-ip27/ip27-klnuma.c @@ -114,7 +114,7 @@ void __init replicate_kernel_text() * data structures on the first couple of pages of the first slot of each * node. If this is the case, getfirstfree(node) > getslotstart(node, 0). */ -pfn_t node_getfirstfree(cnodeid_t cnode) +unsigned long node_getfirstfree(cnodeid_t cnode) { unsigned long loadbase = REP_BASE; nasid_t nasid = COMPACT_TO_NASID_NODEID(cnode); diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c index 5f2bddb1860e..1230f56429d7 100644 --- a/arch/mips/sgi-ip27/ip27-memory.c +++ b/arch/mips/sgi-ip27/ip27-memory.c @@ -255,14 +255,14 @@ static void __init dump_topology(void) } } -static pfn_t __init slot_getbasepfn(cnodeid_t cnode, int slot) +static unsigned long __init slot_getbasepfn(cnodeid_t cnode, int slot) { nasid_t nasid = COMPACT_TO_NASID_NODEID(cnode); - return ((pfn_t)nasid << PFN_NASIDSHFT) | (slot << SLOT_PFNSHIFT); + return ((unsigned long)nasid << PFN_NASIDSHFT) | (slot << SLOT_PFNSHIFT); } -static pfn_t __init slot_psize_compute(cnodeid_t node, int slot) +static unsigned long __init slot_psize_compute(cnodeid_t node, int slot) { nasid_t nasid; lboard_t *brd; @@ -353,7 +353,7 @@ static void __init mlreset(void) static void __init szmem(void) { - pfn_t slot_psize, slot0sz = 0, nodebytes; /* Hack to detect problem configs */ + unsigned long slot_psize, slot0sz = 0, nodebytes; /* Hack to detect problem configs */ int slot; cnodeid_t node; @@ -390,10 +390,10 @@ static void __init szmem(void) static void __init node_mem_init(cnodeid_t node) { - pfn_t slot_firstpfn = slot_getbasepfn(node, 0); - pfn_t slot_freepfn = node_getfirstfree(node); + unsigned long slot_firstpfn = slot_getbasepfn(node, 0); + unsigned long slot_freepfn = node_getfirstfree(node); unsigned long bootmap_size; - pfn_t start_pfn, end_pfn; + unsigned long start_pfn, end_pfn; get_pfn_range_for_nid(node, &start_pfn, &end_pfn); @@ -467,7 +467,7 @@ void __init paging_init(void) pagetable_init(); for_each_online_node(node) { - pfn_t start_pfn, end_pfn; + unsigned long start_pfn, end_pfn; get_pfn_range_for_nid(node, &start_pfn, &end_pfn); diff --git a/arch/mips/sgi-ip27/ip27-timer.c b/arch/mips/sgi-ip27/ip27-timer.c index fff58ac176f3..2e21b761cb9c 100644 --- a/arch/mips/sgi-ip27/ip27-timer.c +++ b/arch/mips/sgi-ip27/ip27-timer.c @@ -69,7 +69,7 @@ static void rt_set_mode(enum clock_event_mode mode, /* Nothing to do ... */ } -int rt_timer_irq; +unsigned int rt_timer_irq; static DEFINE_PER_CPU(struct clock_event_device, hub_rt_clockevent); static DEFINE_PER_CPU(char [11], hub_rt_name); diff --git a/arch/mips/txx9/generic/setup.c b/arch/mips/txx9/generic/setup.c index 5524f2c7b05c..5364aabc2102 100644 --- a/arch/mips/txx9/generic/setup.c +++ b/arch/mips/txx9/generic/setup.c @@ -118,7 +118,7 @@ EXPORT_SYMBOL(clk_put); /* GPIO support */ -#ifdef CONFIG_GENERIC_GPIO +#ifdef CONFIG_GPIOLIB int gpio_to_irq(unsigned gpio) { return -EINVAL; diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig index 81b9ddbc9166..1072bfd18c50 100644 --- a/arch/openrisc/Kconfig +++ b/arch/openrisc/Kconfig @@ -44,9 +44,6 @@ config GENERIC_HWEIGHT config NO_IOPORT def_bool y -config GENERIC_GPIO - def_bool y - config TRACE_IRQFLAGS_SUPPORT def_bool y diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index bbbe02197afb..c33e3ad2c8fd 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -82,11 +82,6 @@ config GENERIC_HWEIGHT bool default y -config GENERIC_GPIO - bool - help - Generic GPIO API support - config PPC bool default y diff --git a/arch/powerpc/platforms/40x/Kconfig b/arch/powerpc/platforms/40x/Kconfig index bd40bbb15e14..6e287f1294fa 100644 --- a/arch/powerpc/platforms/40x/Kconfig +++ b/arch/powerpc/platforms/40x/Kconfig @@ -138,7 +138,6 @@ config PPC4xx_GPIO bool "PPC4xx GPIO support" depends on 40x select ARCH_REQUIRE_GPIOLIB - select GENERIC_GPIO help Enable gpiolib support for ppc40x based boards diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig index 7be93367d92f..d6c7506ec7d9 100644 --- a/arch/powerpc/platforms/44x/Kconfig +++ b/arch/powerpc/platforms/44x/Kconfig @@ -248,7 +248,6 @@ config PPC4xx_GPIO bool "PPC4xx GPIO support" depends on 44x select ARCH_REQUIRE_GPIOLIB - select GENERIC_GPIO help Enable gpiolib support for ppc440 based boards diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig index 8f02b05f4c96..efdd37c775ad 100644 --- a/arch/powerpc/platforms/85xx/Kconfig +++ b/arch/powerpc/platforms/85xx/Kconfig @@ -203,7 +203,6 @@ config GE_IMP3A select DEFAULT_UIMAGE select SWIOTLB select MMIO_NVRAM - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select GE_FPGA help @@ -328,7 +327,7 @@ config B4_QDS select PPC_E500MC select PHYS_64BIT select SWIOTLB - select GENERIC_GPIO + select GPIOLIB select ARCH_REQUIRE_GPIOLIB select HAS_RAPIDIO select PPC_EPAPR_HV_PIC diff --git a/arch/powerpc/platforms/86xx/Kconfig b/arch/powerpc/platforms/86xx/Kconfig index 7a6279e38213..1afd1e4a2dd2 100644 --- a/arch/powerpc/platforms/86xx/Kconfig +++ b/arch/powerpc/platforms/86xx/Kconfig @@ -37,7 +37,6 @@ config GEF_PPC9A bool "GE PPC9A" select DEFAULT_UIMAGE select MMIO_NVRAM - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select GE_FPGA help @@ -47,7 +46,6 @@ config GEF_SBC310 bool "GE SBC310" select DEFAULT_UIMAGE select MMIO_NVRAM - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select GE_FPGA help @@ -57,7 +55,6 @@ config GEF_SBC610 bool "GE SBC610" select DEFAULT_UIMAGE select MMIO_NVRAM - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select GE_FPGA select HAS_RAPIDIO diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig index 1fb0b3cddeb3..8dec3c0911ad 100644 --- a/arch/powerpc/platforms/8xx/Kconfig +++ b/arch/powerpc/platforms/8xx/Kconfig @@ -114,7 +114,6 @@ config 8xx_COPYBACK config 8xx_GPIO bool "GPIO API Support" - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB help Saying Y here will cause the ports on an MPC8xx processor to be used diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig index 34d224be93ba..a881232a3cce 100644 --- a/arch/powerpc/platforms/Kconfig +++ b/arch/powerpc/platforms/Kconfig @@ -302,7 +302,6 @@ config QUICC_ENGINE config QE_GPIO bool "QE GPIO support" depends on QUICC_ENGINE - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB help Say Y here if you're going to use hardware that connects to the @@ -315,7 +314,6 @@ config CPM2 select PPC_LIB_RHEAP select PPC_PCI_CHOICE select ARCH_REQUIRE_GPIOLIB - select GENERIC_GPIO help The CPM2 (Communications Processor Module) is a coprocessor on embedded CPUs made by Freescale. Selecting this option means that @@ -353,7 +351,6 @@ config OF_RTC config SIMPLE_GPIO bool "Support for simple, memory-mapped GPIO controllers" depends on PPC - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB help Say Y here to support simple, memory-mapped GPIO controllers. @@ -364,7 +361,6 @@ config SIMPLE_GPIO config MCU_MPC8349EMITX bool "MPC8349E-mITX MCU driver" depends on I2C=y && PPC_83xx - select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB help Say Y here to enable soft power-off functionality on the Freescale diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 78d8ace57272..8c868cf2cf93 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -93,9 +93,6 @@ config GENERIC_CSUM config GENERIC_HWEIGHT def_bool y -config GENERIC_GPIO - def_bool n - config GENERIC_CALIBRATE_DELAY bool diff --git a/arch/sh/boards/mach-sdk7786/Makefile b/arch/sh/boards/mach-sdk7786/Makefile index 8ae56e9560ac..45d32e3590b9 100644 --- a/arch/sh/boards/mach-sdk7786/Makefile +++ b/arch/sh/boards/mach-sdk7786/Makefile @@ -1,4 +1,4 @@ obj-y := fpga.o irq.o nmi.o setup.o -obj-$(CONFIG_GENERIC_GPIO) += gpio.o +obj-$(CONFIG_GPIOLIB) += gpio.o obj-$(CONFIG_HAVE_SRAM_POOL) += sram.o diff --git a/arch/sh/boards/mach-x3proto/Makefile b/arch/sh/boards/mach-x3proto/Makefile index 708c21c919ff..0cbe3d02dea3 100644 --- a/arch/sh/boards/mach-x3proto/Makefile +++ b/arch/sh/boards/mach-x3proto/Makefile @@ -1,3 +1,3 @@ obj-y += setup.o ilsel.o -obj-$(CONFIG_GENERIC_GPIO) += gpio.o +obj-$(CONFIG_GPIOLIB) += gpio.o diff --git a/arch/sh/kernel/cpu/sh2a/Makefile b/arch/sh/kernel/cpu/sh2a/Makefile index 7fdc102d0dd6..990195d98456 100644 --- a/arch/sh/kernel/cpu/sh2a/Makefile +++ b/arch/sh/kernel/cpu/sh2a/Makefile @@ -21,4 +21,4 @@ pinmux-$(CONFIG_CPU_SUBTYPE_SH7203) := pinmux-sh7203.o pinmux-$(CONFIG_CPU_SUBTYPE_SH7264) := pinmux-sh7264.o pinmux-$(CONFIG_CPU_SUBTYPE_SH7269) := pinmux-sh7269.o -obj-$(CONFIG_GENERIC_GPIO) += $(pinmux-y) +obj-$(CONFIG_GPIOLIB) += $(pinmux-y) diff --git a/arch/sh/kernel/cpu/sh3/Makefile b/arch/sh/kernel/cpu/sh3/Makefile index 6f13f33a35ff..d3634ae7b71a 100644 --- a/arch/sh/kernel/cpu/sh3/Makefile +++ b/arch/sh/kernel/cpu/sh3/Makefile @@ -30,4 +30,4 @@ clock-$(CONFIG_CPU_SUBTYPE_SH7712) := clock-sh7712.o pinmux-$(CONFIG_CPU_SUBTYPE_SH7720) := pinmux-sh7720.o obj-y += $(clock-y) -obj-$(CONFIG_GENERIC_GPIO) += $(pinmux-y) +obj-$(CONFIG_GPIOLIB) += $(pinmux-y) diff --git a/arch/sh/kernel/cpu/sh4a/Makefile b/arch/sh/kernel/cpu/sh4a/Makefile index 8fc6ec2be2fa..0705df775208 100644 --- a/arch/sh/kernel/cpu/sh4a/Makefile +++ b/arch/sh/kernel/cpu/sh4a/Makefile @@ -47,6 +47,6 @@ pinmux-$(CONFIG_CPU_SUBTYPE_SHX3) := pinmux-shx3.o obj-y += $(clock-y) obj-$(CONFIG_SMP) += $(smp-y) -obj-$(CONFIG_GENERIC_GPIO) += $(pinmux-y) +obj-$(CONFIG_GPIOLIB) += $(pinmux-y) obj-$(CONFIG_PERF_EVENTS) += perf_event.o obj-$(CONFIG_HAVE_HW_BREAKPOINT) += ubc.o diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index a639c0d07b8b..9ac9f1666339 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -137,11 +137,6 @@ config GENERIC_ISA_DMA bool default y if SPARC32 -config GENERIC_GPIO - bool - help - Generic GPIO API support - config ARCH_SUPPORTS_DEBUG_PAGEALLOC def_bool y if SPARC64 diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig index 5b6a40dd5556..3aa37669ff8c 100644 --- a/arch/tile/Kconfig +++ b/arch/tile/Kconfig @@ -355,11 +355,17 @@ config HARDWALL config KERNEL_PL int "Processor protection level for kernel" range 1 2 - default "1" + default 2 if TILEGX + default 1 if !TILEGX ---help--- - This setting determines the processor protection level the - kernel will be built to run at. Generally you should use - the default value here. + Since MDE 4.2, the Tilera hypervisor runs the kernel + at PL2 by default. If running under an older hypervisor, + or as a KVM guest, you must run at PL1. (The current + hypervisor may also be recompiled with "make HV_PL=2" to + allow it to run a kernel at PL1, but clients running at PL1 + are not expected to be supported indefinitely.) + + If you're not sure, don't change the default. source "arch/tile/gxio/Kconfig" diff --git a/arch/tile/include/hv/hypervisor.h b/arch/tile/include/hv/hypervisor.h index ccd847e2347f..837dca5328c2 100644 --- a/arch/tile/include/hv/hypervisor.h +++ b/arch/tile/include/hv/hypervisor.h @@ -107,7 +107,22 @@ #define HV_DISPATCH_ENTRY_SIZE 32 /** Version of the hypervisor interface defined by this file */ -#define _HV_VERSION 11 +#define _HV_VERSION 13 + +/** Last version of the hypervisor interface with old hv_init() ABI. + * + * The change from version 12 to version 13 corresponds to launching + * the client by default at PL2 instead of PL1 (corresponding to the + * hv itself running at PL3 instead of PL2). To make this explicit, + * the hv_init() API was also extended so the client can report its + * desired PL, resulting in a more helpful failure diagnostic. If you + * call hv_init() with _HV_VERSION_OLD_HV_INIT and omit the client_pl + * argument, the hypervisor will assume client_pl = 1. + * + * Note that this is a deprecated solution and we do not expect to + * support clients of the Tilera hypervisor running at PL1 indefinitely. + */ +#define _HV_VERSION_OLD_HV_INIT 12 /* Index into hypervisor interface dispatch code blocks. * @@ -377,7 +392,11 @@ typedef int HV_Errno; #ifndef __ASSEMBLER__ /** Pass HV_VERSION to hv_init to request this version of the interface. */ -typedef enum { HV_VERSION = _HV_VERSION } HV_VersionNumber; +typedef enum { + HV_VERSION = _HV_VERSION, + HV_VERSION_OLD_HV_INIT = _HV_VERSION_OLD_HV_INIT, + +} HV_VersionNumber; /** Initializes the hypervisor. * @@ -385,9 +404,11 @@ typedef enum { HV_VERSION = _HV_VERSION } HV_VersionNumber; * that this program expects, typically HV_VERSION. * @param chip_num Architecture number of the chip the client was built for. * @param chip_rev_num Revision number of the chip the client was built for. + * @param client_pl Privilege level the client is built for + * (not required if interface_version_number == HV_VERSION_OLD_HV_INIT). */ void hv_init(HV_VersionNumber interface_version_number, - int chip_num, int chip_rev_num); + int chip_num, int chip_rev_num, int client_pl); /** Queries we can make for hv_sysconf(). diff --git a/arch/tile/kernel/head_32.S b/arch/tile/kernel/head_32.S index f71bfeeaf1a9..ac115307e5e4 100644 --- a/arch/tile/kernel/head_32.S +++ b/arch/tile/kernel/head_32.S @@ -38,7 +38,7 @@ ENTRY(_start) movei r2, TILE_CHIP_REV } { - moveli r0, _HV_VERSION + moveli r0, _HV_VERSION_OLD_HV_INIT jal hv_init } /* Get a reasonable default ASID in r0 */ diff --git a/arch/tile/kernel/head_64.S b/arch/tile/kernel/head_64.S index f9a2734f7b82..6093964fa5c7 100644 --- a/arch/tile/kernel/head_64.S +++ b/arch/tile/kernel/head_64.S @@ -34,13 +34,19 @@ ENTRY(_start) /* Notify the hypervisor of what version of the API we want */ { +#if KERNEL_PL == 1 && _HV_VERSION == 13 + /* Support older hypervisors by asking for API version 12. */ + movei r0, _HV_VERSION_OLD_HV_INIT +#else + movei r0, _HV_VERSION +#endif movei r1, TILE_CHIP - movei r2, TILE_CHIP_REV } { - moveli r0, _HV_VERSION - jal hv_init + movei r2, TILE_CHIP_REV + movei r3, KERNEL_PL } + jal hv_init /* Get a reasonable default ASID in r0 */ { move r0, zero diff --git a/arch/tile/lib/spinlock_32.c b/arch/tile/lib/spinlock_32.c index b16ac49a968e..b34f79aada48 100644 --- a/arch/tile/lib/spinlock_32.c +++ b/arch/tile/lib/spinlock_32.c @@ -101,7 +101,7 @@ EXPORT_SYMBOL(arch_spin_unlock_wait); * preserve the semantic that the same read lock can be acquired in an * interrupt context. */ -inline int arch_read_trylock(arch_rwlock_t *rwlock) +int arch_read_trylock(arch_rwlock_t *rwlock) { u32 val; __insn_mtspr(SPR_INTERRUPT_CRITICAL_SECTION, 1); diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig index 2943e3acdf0c..41bcc0013442 100644 --- a/arch/unicore32/Kconfig +++ b/arch/unicore32/Kconfig @@ -23,9 +23,6 @@ config UNICORE32 designs licensed by PKUnity Ltd. Please see web page at . -config GENERIC_GPIO - def_bool y - config GENERIC_CSUM def_bool y @@ -156,7 +153,7 @@ source "mm/Kconfig" config LEDS def_bool y - depends on GENERIC_GPIO + depends on GPIOLIB config ALIGNMENT_TRAP def_bool y @@ -219,7 +216,6 @@ if ARCH_PUV3 config PUV3_GPIO bool depends on !ARCH_FPGA - select GENERIC_GPIO select GPIO_SYSFS default y diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5db2117ae288..6a154a91c7e7 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -174,9 +174,6 @@ config GENERIC_BUG_RELATIVE_POINTERS config GENERIC_HWEIGHT def_bool y -config GENERIC_GPIO - bool - config ARCH_MAY_HAVE_PC_FDC def_bool y depends on ISA_DMA_API diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c index 6eb18c42a28a..0e0fabf17342 100644 --- a/arch/x86/pci/mrst.c +++ b/arch/x86/pci/mrst.c @@ -141,6 +141,11 @@ static int pci_device_update_fixed(struct pci_bus *bus, unsigned int devfn, */ static bool type1_access_ok(unsigned int bus, unsigned int devfn, int reg) { + if (bus == 0 && (devfn == PCI_DEVFN(2, 0) + || devfn == PCI_DEVFN(0, 0) + || devfn == PCI_DEVFN(3, 0))) + return 1; + /* This is a workaround for A0 LNC bug where PCI status register does * not have new CAP bit set. can not be written by SW either. * @@ -150,10 +155,7 @@ static bool type1_access_ok(unsigned int bus, unsigned int devfn, int reg) */ if (reg >= 0x100 || reg == PCI_STATUS || reg == PCI_HEADER_TYPE) return 0; - if (bus == 0 && (devfn == PCI_DEVFN(2, 0) - || devfn == PCI_DEVFN(0, 0) - || devfn == PCI_DEVFN(3, 0))) - return 1; + return 0; /* langwell on others */ } diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig index b09de49dbec5..0a1b95f81a32 100644 --- a/arch/xtensa/Kconfig +++ b/arch/xtensa/Kconfig @@ -1,11 +1,9 @@ -config FRAME_POINTER - def_bool n - config ZONE_DMA def_bool y config XTENSA def_bool y + select ARCH_WANT_FRAME_POINTERS select HAVE_IDE select GENERIC_ATOMIC64 select HAVE_GENERIC_HARDIRQS @@ -33,9 +31,6 @@ config RWSEM_XCHGADD_ALGORITHM config GENERIC_HWEIGHT def_bool y -config GENERIC_GPIO - bool - config ARCH_HAS_ILOG2_U32 def_bool n @@ -52,6 +47,15 @@ config HZ source "init/Kconfig" source "kernel/Kconfig.freezer" +config LOCKDEP_SUPPORT + def_bool y + +config STACKTRACE_SUPPORT + def_bool y + +config TRACE_IRQFLAGS_SUPPORT + def_bool y + config MMU def_bool n @@ -103,6 +107,35 @@ config MATH_EMULATION help Can we use information of configuration file? +config INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX + bool "Initialize Xtensa MMU inside the Linux kernel code" + default y + help + Earlier version initialized the MMU in the exception vector + before jumping to _startup in head.S and had an advantage that + it was possible to place a software breakpoint at 'reset' and + then enter your normal kernel breakpoints once the MMU was mapped + to the kernel mappings (0XC0000000). + + This unfortunately doesn't work for U-Boot and likley also wont + work for using KEXEC to have a hot kernel ready for doing a + KDUMP. + + So now the MMU is initialized in head.S but it's necessary to + use hardware breakpoints (gdb 'hbreak' cmd) to break at _startup. + xt-gdb can't place a Software Breakpoint in the 0XD region prior + to mapping the MMU and after mapping even if the area of low memory + was mapped gdb wouldn't remove the breakpoint on hitting it as the + PC wouldn't match. Since Hardware Breakpoints are recommended for + Linux configurations it seems reasonable to just assume they exist + and leave this older mechanism for unfortunate souls that choose + not to follow Tensilica's recommendation. + + Selecting this will cause U-Boot to set the KERNEL Load and Entry + address at 0x00003000 instead of the mapped std of 0xD0003000. + + If in doubt, say Y. + endmenu config XTENSA_CALIBRATE_CCOUNT @@ -252,21 +285,6 @@ endmenu menu "Executable file formats" -# only elf supported -config KCORE_ELF - def_bool y - depends on PROC_FS - help - If you enabled support for /proc file system then the file - /proc/kcore will contain the kernel core image in ELF format. This - can be used in gdb: - - $ cd /usr/src/linux ; gdb vmlinux /proc/kcore - - This is especially useful if you have compiled the kernel with the - "-g" option to preserve debugging information. It is mainly used - for examining kernel data structures on the live kernel. - source "fs/Kconfig.binfmt" endmenu diff --git a/arch/xtensa/boot/boot-elf/Makefile b/arch/xtensa/boot/boot-elf/Makefile index 1fe01b78c124..89db089f5a12 100644 --- a/arch/xtensa/boot/boot-elf/Makefile +++ b/arch/xtensa/boot/boot-elf/Makefile @@ -12,6 +12,7 @@ endif export OBJCOPY_ARGS export CPPFLAGS_boot.lds += -P -C +export KBUILD_AFLAGS += -mtext-section-literals boot-y := bootstrap.o diff --git a/arch/xtensa/boot/boot-elf/boot.lds.S b/arch/xtensa/boot/boot-elf/boot.lds.S index 7b646e0a6486..932b58ef33d4 100644 --- a/arch/xtensa/boot/boot-elf/boot.lds.S +++ b/arch/xtensa/boot/boot-elf/boot.lds.S @@ -1,41 +1,29 @@ -#include +/* + * linux/arch/xtensa/boot/boot-elf/boot.lds.S + * + * Copyright (C) 2008 - 2013 by Tensilica Inc. + * + * Chris Zankel + * Marc Gauthier + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include OUTPUT_ARCH(xtensa) ENTRY(_ResetVector) SECTIONS { - .start 0xD0000000 : { *(.start) } - - .text 0xD0000000: - { - __reloc_start = . ; - _text_start = . ; - *(.literal .text.literal .text) - _text_end = . ; - } - - .rodata ALIGN(0x04): - { - *(.rodata) - *(.rodata1) - } - - .data ALIGN(0x04): + .ResetVector.text XCHAL_RESET_VECTOR_VADDR : { - *(.data) - *(.data1) - *(.sdata) - *(.sdata2) - *(.got.plt) - *(.got) - *(.dynamic) + *(.ResetVector.text) } - __reloc_end = . ; - - . = ALIGN(0x10); - __image_load = . ; - .image 0xd0001000: + .image KERNELOFFSET: AT (LOAD_MEMORY_ADDRESS) { _image_start = .; *(image) @@ -43,7 +31,6 @@ SECTIONS _image_end = . ; } - .bss ((LOADADDR(.image) + SIZEOF(.image) + 3) & ~ 3): { __bss_start = .; @@ -53,14 +40,15 @@ SECTIONS *(.bss) __bss_end = .; } - _end = .; - _param_start = .; - .ResetVector.text XCHAL_RESET_VECTOR_VADDR : + /* + * This is a remapped copy of the Reset Vector Code. + * It keeps gdb in sync with the PC after switching + * to the temporary mapping used while setting up + * the V2 MMU mappings for Linux. + */ + .ResetVector.remapped_text 0x46000000 (INFO): { - *(.ResetVector.text) + *(.ResetVector.remapped_text) } - - - PROVIDE (end = .); } diff --git a/arch/xtensa/boot/boot-elf/bootstrap.S b/arch/xtensa/boot/boot-elf/bootstrap.S index 464298bc348b..1388a499753b 100644 --- a/arch/xtensa/boot/boot-elf/bootstrap.S +++ b/arch/xtensa/boot/boot-elf/bootstrap.S @@ -1,18 +1,77 @@ +/* + * arch/xtensa/boot/boot-elf/bootstrap.S + * + * Low-level exception handling + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2004 - 2013 by Tensilica Inc. + * + * Chris Zankel + * Marc Gauthier + * Piet Delaney + */ #include +#include +#include +#include +#include +#include +#include - -/* ResetVector - */ - .section .ResetVector.text, "ax" + .section .ResetVector.text, "ax" .global _ResetVector + .global reset + _ResetVector: - _j reset + _j _SetupMMU + + .begin no-absolute-literals + .literal_position + .align 4 RomInitAddr: - .word 0xd0001000 +#if defined(CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX) && \ + XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY + .word 0x00003000 +#else + .word 0xd0003000 +#endif RomBootParam: .word _bootparam +_bootparam: + .short BP_TAG_FIRST + .short 4 + .long BP_VERSION + .short BP_TAG_LAST + .short 0 + .long 0 + + .align 4 +_SetupMMU: + movi a0, 0 + wsr a0, windowbase + rsync + movi a0, 1 + wsr a0, windowstart + rsync + movi a0, 0x1F + wsr a0, ps + rsync + + Offset = _SetupMMU - _ResetVector + +#ifndef CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX + initialize_mmu +#endif + + .end no-absolute-literals + + rsil a0, XCHAL_DEBUGLEVEL-1 + rsync reset: l32r a0, RomInitAddr l32r a2, RomBootParam @@ -21,13 +80,25 @@ reset: jx a0 .align 4 - .section .bootstrap.data, "aw" - .globl _bootparam -_bootparam: - .short BP_TAG_FIRST - .short 4 - .long BP_VERSION - .short BP_TAG_LAST - .short 0 - .long 0 + .section .ResetVector.remapped_text, "x" + .global _RemappedResetVector + + /* Do org before literals */ + .org 0 + +_RemappedResetVector: + .begin no-absolute-literals + .literal_position + + _j _RemappedSetupMMU + + /* Position Remapped code at the same location as the original code */ + . = _RemappedResetVector + Offset + +_RemappedSetupMMU: +#ifndef CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX + initialize_mmu +#endif + + .end no-absolute-literals diff --git a/arch/xtensa/boot/boot-redboot/boot.ld b/arch/xtensa/boot/boot-redboot/boot.ld index 5bbcaf9e830d..b0b9e95b58bd 100644 --- a/arch/xtensa/boot/boot-redboot/boot.ld +++ b/arch/xtensa/boot/boot-redboot/boot.ld @@ -33,7 +33,7 @@ SECTIONS . = ALIGN(0x10); __image_load = . ; - .image 0xd0001000: AT(__image_load) + .image 0xd0003000: AT(__image_load) { _image_start = .; *(image) diff --git a/arch/xtensa/boot/boot-uboot/Makefile b/arch/xtensa/boot/boot-uboot/Makefile index bfbf8af582f1..545759819ef9 100644 --- a/arch/xtensa/boot/boot-uboot/Makefile +++ b/arch/xtensa/boot/boot-uboot/Makefile @@ -4,7 +4,11 @@ # for more details. # -UIMAGE_LOADADDR = 0xd0001000 +ifdef CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX +UIMAGE_LOADADDR = 0x00003000 +else +UIMAGE_LOADADDR = 0xd0003000 +endif UIMAGE_COMPRESSION = gzip $(obj)/../uImage: vmlinux.bin.gz FORCE diff --git a/arch/xtensa/configs/iss_defconfig b/arch/xtensa/configs/iss_defconfig index ddab37b24741..77c52f80187a 100644 --- a/arch/xtensa/configs/iss_defconfig +++ b/arch/xtensa/configs/iss_defconfig @@ -10,7 +10,6 @@ CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_FIND_NEXT_BIT=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_HARDIRQS=y -CONFIG_GENERIC_GPIO=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_NO_IOPORT=y diff --git a/arch/xtensa/configs/s6105_defconfig b/arch/xtensa/configs/s6105_defconfig index eaf1b8fc6556..4799c6a526b5 100644 --- a/arch/xtensa/configs/s6105_defconfig +++ b/arch/xtensa/configs/s6105_defconfig @@ -10,7 +10,6 @@ CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_FIND_NEXT_BIT=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_HARDIRQS=y -CONFIG_GENERIC_GPIO=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_NO_IOPORT=y diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild index 095f0a2244f7..1b982641ec35 100644 --- a/arch/xtensa/include/asm/Kbuild +++ b/arch/xtensa/include/asm/Kbuild @@ -15,6 +15,7 @@ generic-y += irq_regs.h generic-y += kdebug.h generic-y += kmap_types.h generic-y += kvm_para.h +generic-y += linkage.h generic-y += local.h generic-y += local64.h generic-y += percpu.h diff --git a/arch/xtensa/include/asm/ftrace.h b/arch/xtensa/include/asm/ftrace.h index 40a8c178f10d..36dc7a684397 100644 --- a/arch/xtensa/include/asm/ftrace.h +++ b/arch/xtensa/include/asm/ftrace.h @@ -1 +1,33 @@ -/* empty */ +/* + * arch/xtensa/include/asm/ftrace.h + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2013 Tensilica Inc. + */ +#ifndef _XTENSA_FTRACE_H +#define _XTENSA_FTRACE_H + +#include + +#define HAVE_ARCH_CALLER_ADDR +#define CALLER_ADDR0 ({ unsigned long a0, a1; \ + __asm__ __volatile__ ( \ + "mov %0, a0\n" \ + "mov %1, a1\n" \ + : "=r"(a0), "=r"(a1) : : ); \ + MAKE_PC_FROM_RA(a0, a1); }) +#ifdef CONFIG_FRAME_POINTER +extern unsigned long return_address(unsigned level); +#define CALLER_ADDR1 return_address(1) +#define CALLER_ADDR2 return_address(2) +#define CALLER_ADDR3 return_address(3) +#else +#define CALLER_ADDR1 (0) +#define CALLER_ADDR2 (0) +#define CALLER_ADDR3 (0) +#endif + +#endif /* _XTENSA_FTRACE_H */ diff --git a/arch/xtensa/include/asm/initialize_mmu.h b/arch/xtensa/include/asm/initialize_mmu.h index e1f8ba4061ed..722553f17db3 100644 --- a/arch/xtensa/include/asm/initialize_mmu.h +++ b/arch/xtensa/include/asm/initialize_mmu.h @@ -23,6 +23,9 @@ #ifndef _XTENSA_INITIALIZE_MMU_H #define _XTENSA_INITIALIZE_MMU_H +#include +#include + #ifdef __ASSEMBLY__ #define XTENSA_HWVERSION_RC_2009_0 230000 @@ -48,6 +51,110 @@ * (XCHAL_HW_MIN_VERSION >= XTENSA_HWVERSION_RC_2009_0) */ +#if defined(CONFIG_MMU) && XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY +/* + * Have MMU v3 + */ + +#if !XCHAL_HAVE_VECBASE +# error "MMU v3 requires reloc vectors" +#endif + + movi a1, 0 + _call0 1f + _j 2f + + .align 4 +1: movi a2, 0x10000000 + movi a3, 0x18000000 + add a2, a2, a0 +9: bgeu a2, a3, 9b /* PC is out of the expected range */ + + /* Step 1: invalidate mapping at 0x40000000..0x5FFFFFFF. */ + + movi a2, 0x40000006 + idtlb a2 + iitlb a2 + isync + + /* Step 2: map 0x40000000..0x47FFFFFF to paddr containing this code + * and jump to the new mapping. + */ +#define CA_BYPASS (_PAGE_CA_BYPASS | _PAGE_HW_WRITE | _PAGE_HW_EXEC) +#define CA_WRITEBACK (_PAGE_CA_WB | _PAGE_HW_WRITE | _PAGE_HW_EXEC) + + srli a3, a0, 27 + slli a3, a3, 27 + addi a3, a3, CA_BYPASS + addi a7, a2, -1 + wdtlb a3, a7 + witlb a3, a7 + isync + + slli a4, a0, 5 + srli a4, a4, 5 + addi a5, a2, -6 + add a4, a4, a5 + jx a4 + + /* Step 3: unmap everything other than current area. + * Start at 0x60000000, wrap around, and end with 0x20000000 + */ +2: movi a4, 0x20000000 + add a5, a2, a4 +3: idtlb a5 + iitlb a5 + add a5, a5, a4 + bne a5, a2, 3b + + /* Step 4: Setup MMU with the old V2 mappings. */ + movi a6, 0x01000000 + wsr a6, ITLBCFG + wsr a6, DTLBCFG + isync + + movi a5, 0xd0000005 + movi a4, CA_WRITEBACK + wdtlb a4, a5 + witlb a4, a5 + + movi a5, 0xd8000005 + movi a4, CA_BYPASS + wdtlb a4, a5 + witlb a4, a5 + + movi a5, 0xe0000006 + movi a4, 0xf0000000 + CA_WRITEBACK + wdtlb a4, a5 + witlb a4, a5 + + movi a5, 0xf0000006 + movi a4, 0xf0000000 + CA_BYPASS + wdtlb a4, a5 + witlb a4, a5 + + isync + + /* Jump to self, using MMU v2 mappings. */ + movi a4, 1f + jx a4 + +1: + movi a2, VECBASE_RESET_VADDR + wsr a2, vecbase + + /* Step 5: remove temporary mapping. */ + idtlb a7 + iitlb a7 + isync + + movi a0, 0 + wsr a0, ptevaddr + rsync + +#endif /* defined(CONFIG_MMU) && XCHAL_HAVE_PTP_MMU && + XCHAL_HAVE_SPANNING_WAY */ + .endm #endif /*__ASSEMBLY__*/ diff --git a/arch/xtensa/include/asm/irqflags.h b/arch/xtensa/include/asm/irqflags.h index f865b1c1eae4..ea36674c6ec5 100644 --- a/arch/xtensa/include/asm/irqflags.h +++ b/arch/xtensa/include/asm/irqflags.h @@ -47,7 +47,10 @@ static inline void arch_local_irq_restore(unsigned long flags) static inline bool arch_irqs_disabled_flags(unsigned long flags) { - return (flags & 0xf) != 0; +#if XCHAL_EXCM_LEVEL < LOCKLEVEL || (1 << PS_EXCM_BIT) < LOCKLEVEL +#error "XCHAL_EXCM_LEVEL and 1<= LOCKLEVEL; } static inline bool arch_irqs_disabled(void) diff --git a/arch/xtensa/include/asm/linkage.h b/arch/xtensa/include/asm/linkage.h deleted file mode 100644 index bf2128a99d79..000000000000 --- a/arch/xtensa/include/asm/linkage.h +++ /dev/null @@ -1,16 +0,0 @@ -/* - * include/asm-xtensa/linkage.h - * - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file "COPYING" in the main directory of this archive - * for more details. - * - * Copyright (C) 2001 - 2005 Tensilica Inc. - */ - -#ifndef _XTENSA_LINKAGE_H -#define _XTENSA_LINKAGE_H - -/* Nothing to do here ... */ - -#endif /* _XTENSA_LINKAGE_H */ diff --git a/arch/xtensa/include/asm/stacktrace.h b/arch/xtensa/include/asm/stacktrace.h new file mode 100644 index 000000000000..6a05fcb0a20d --- /dev/null +++ b/arch/xtensa/include/asm/stacktrace.h @@ -0,0 +1,36 @@ +/* + * arch/xtensa/include/asm/stacktrace.h + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2001 - 2013 Tensilica Inc. + */ +#ifndef _XTENSA_STACKTRACE_H +#define _XTENSA_STACKTRACE_H + +#include + +struct stackframe { + unsigned long pc; + unsigned long sp; +}; + +static __always_inline unsigned long *stack_pointer(struct task_struct *task) +{ + unsigned long *sp; + + if (!task || task == current) + __asm__ __volatile__ ("mov %0, a1\n" : "=a"(sp)); + else + sp = (unsigned long *)task->thread.sp; + + return sp; +} + +void walk_stackframe(unsigned long *sp, + int (*fn)(struct stackframe *frame, void *data), + void *data); + +#endif /* _XTENSA_STACKTRACE_H */ diff --git a/arch/xtensa/include/asm/timex.h b/arch/xtensa/include/asm/timex.h index 9e85ce8bd8dd..3d35e5d0367e 100644 --- a/arch/xtensa/include/asm/timex.h +++ b/arch/xtensa/include/asm/timex.h @@ -19,13 +19,16 @@ #define _INTLEVEL(x) XCHAL_INT ## x ## _LEVEL #define INTLEVEL(x) _INTLEVEL(x) -#if INTLEVEL(XCHAL_TIMER0_INTERRUPT) <= XCHAL_EXCM_LEVEL +#if XCHAL_NUM_TIMERS > 0 && \ + INTLEVEL(XCHAL_TIMER0_INTERRUPT) <= XCHAL_EXCM_LEVEL # define LINUX_TIMER 0 # define LINUX_TIMER_INT XCHAL_TIMER0_INTERRUPT -#elif INTLEVEL(XCHAL_TIMER1_INTERRUPT) <= XCHAL_EXCM_LEVEL +#elif XCHAL_NUM_TIMERS > 1 && \ + INTLEVEL(XCHAL_TIMER1_INTERRUPT) <= XCHAL_EXCM_LEVEL # define LINUX_TIMER 1 # define LINUX_TIMER_INT XCHAL_TIMER1_INTERRUPT -#elif INTLEVEL(XCHAL_TIMER2_INTERRUPT) <= XCHAL_EXCM_LEVEL +#elif XCHAL_NUM_TIMERS > 2 && \ + INTLEVEL(XCHAL_TIMER2_INTERRUPT) <= XCHAL_EXCM_LEVEL # define LINUX_TIMER 2 # define LINUX_TIMER_INT XCHAL_TIMER2_INTERRUPT #else diff --git a/arch/xtensa/include/asm/traps.h b/arch/xtensa/include/asm/traps.h index b5464ef3cf66..917488a0ab00 100644 --- a/arch/xtensa/include/asm/traps.h +++ b/arch/xtensa/include/asm/traps.h @@ -22,10 +22,9 @@ extern void do_unhandled(struct pt_regs *regs, unsigned long exccause); static inline void spill_registers(void) { - unsigned int a0, ps; __asm__ __volatile__ ( - "movi a14, " __stringify(PS_EXCM_BIT | LOCKLEVEL) "\n\t" + "movi a14, "__stringify((1 << PS_EXCM_BIT) | LOCKLEVEL)"\n\t" "mov a12, a0\n\t" "rsr a13, sar\n\t" "xsr a14, ps\n\t" @@ -35,7 +34,7 @@ static inline void spill_registers(void) "mov a0, a12\n\t" "wsr a13, sar\n\t" "wsr a14, ps\n\t" - : : "a" (&a0), "a" (&ps) + : : #if defined(CONFIG_FRAME_POINTER) : "a2", "a3", "a4", "a11", "a12", "a13", "a14", "a15", #else diff --git a/arch/xtensa/include/asm/vectors.h b/arch/xtensa/include/asm/vectors.h new file mode 100644 index 000000000000..c52b656d0310 --- /dev/null +++ b/arch/xtensa/include/asm/vectors.h @@ -0,0 +1,125 @@ +/* + * arch/xtensa/include/asm/xchal_vaddr_remap.h + * + * Xtensa macros for MMU V3 Support. Deals with re-mapping the Virtual + * Memory Addresses from "Virtual == Physical" to their prevvious V2 MMU + * mappings (KSEG at 0xD0000000 and KIO at 0XF0000000). + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2008 - 2012 Tensilica Inc. + * + * Pete Delaney + * Marc Gauthier + +#if defined(CONFIG_MMU) + +/* Will Become VECBASE */ +#define VIRTUAL_MEMORY_ADDRESS 0xD0000000 + +/* Image Virtual Start Address */ +#define KERNELOFFSET 0xD0003000 + +#if defined(XCHAL_HAVE_PTP_MMU) && XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY + /* MMU v3 - XCHAL_HAVE_PTP_MMU == 1 */ + #define PHYSICAL_MEMORY_ADDRESS 0x00000000 + #define LOAD_MEMORY_ADDRESS 0x00003000 +#else + /* MMU V2 - XCHAL_HAVE_PTP_MMU == 0 */ + #define PHYSICAL_MEMORY_ADDRESS 0xD0000000 + #define LOAD_MEMORY_ADDRESS 0xD0003000 +#endif + +#else /* !defined(CONFIG_MMU) */ + /* MMU Not being used - Virtual == Physical */ + + /* VECBASE */ + #define VIRTUAL_MEMORY_ADDRESS 0x00002000 + + /* Location of the start of the kernel text, _start */ + #define KERNELOFFSET 0x00003000 + #define PHYSICAL_MEMORY_ADDRESS 0x00000000 + + /* Loaded just above possibly live vectors */ + #define LOAD_MEMORY_ADDRESS 0x00003000 + +#endif /* CONFIG_MMU */ + +#define XC_VADDR(offset) (VIRTUAL_MEMORY_ADDRESS + offset) +#define XC_PADDR(offset) (PHYSICAL_MEMORY_ADDRESS + offset) + +/* Used to set VECBASE register */ +#define VECBASE_RESET_VADDR VIRTUAL_MEMORY_ADDRESS + +#define RESET_VECTOR_VECOFS (XCHAL_RESET_VECTOR_VADDR - \ + VECBASE_RESET_VADDR) +#define RESET_VECTOR_VADDR XC_VADDR(RESET_VECTOR_VECOFS) + +#define RESET_VECTOR1_VECOFS (XCHAL_RESET_VECTOR1_VADDR - \ + VECBASE_RESET_VADDR) +#define RESET_VECTOR1_VADDR XC_VADDR(RESET_VECTOR1_VECOFS) + +#if XCHAL_HAVE_VECBASE + +#define USER_VECTOR_VADDR XC_VADDR(XCHAL_USER_VECOFS) +#define KERNEL_VECTOR_VADDR XC_VADDR(XCHAL_KERNEL_VECOFS) +#define DOUBLEEXC_VECTOR_VADDR XC_VADDR(XCHAL_DOUBLEEXC_VECOFS) +#define WINDOW_VECTORS_VADDR XC_VADDR(XCHAL_WINDOW_OF4_VECOFS) +#define INTLEVEL2_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL2_VECOFS) +#define INTLEVEL3_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL3_VECOFS) +#define INTLEVEL4_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL4_VECOFS) +#define INTLEVEL5_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL5_VECOFS) +#define INTLEVEL6_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL6_VECOFS) + +#define DEBUG_VECTOR_VADDR XC_VADDR(XCHAL_DEBUG_VECOFS) + +#undef XCHAL_NMI_VECTOR_VADDR +#define XCHAL_NMI_VECTOR_VADDR XC_VADDR(XCHAL_NMI_VECOFS) + +#undef XCHAL_INTLEVEL7_VECTOR_VADDR +#define XCHAL_INTLEVEL7_VECTOR_VADDR XC_VADDR(XCHAL_INTLEVEL7_VECOFS) + +/* + * These XCHAL_* #defines from varian/core.h + * are not valid to use with V3 MMU. Non-XCHAL + * constants are defined above and should be used. + */ +#undef XCHAL_VECBASE_RESET_VADDR +#undef XCHAL_RESET_VECTOR0_VADDR +#undef XCHAL_USER_VECTOR_VADDR +#undef XCHAL_KERNEL_VECTOR_VADDR +#undef XCHAL_DOUBLEEXC_VECTOR_VADDR +#undef XCHAL_WINDOW_VECTORS_VADDR +#undef XCHAL_INTLEVEL2_VECTOR_VADDR +#undef XCHAL_INTLEVEL3_VECTOR_VADDR +#undef XCHAL_INTLEVEL4_VECTOR_VADDR +#undef XCHAL_INTLEVEL5_VECTOR_VADDR +#undef XCHAL_INTLEVEL6_VECTOR_VADDR +#undef XCHAL_DEBUG_VECTOR_VADDR +#undef XCHAL_NMI_VECTOR_VADDR +#undef XCHAL_INTLEVEL7_VECTOR_VADDR + +#else + +#define USER_VECTOR_VADDR XCHAL_USER_VECTOR_VADDR +#define KERNEL_VECTOR_VADDR XCHAL_KERNEL_VECTOR_VADDR +#define DOUBLEEXC_VECTOR_VADDR XCHAL_DOUBLEEXC_VECTOR_VADDR +#define WINDOW_VECTORS_VADDR XCHAL_WINDOW_VECTORS_VADDR +#define INTLEVEL2_VECTOR_VADDR XCHAL_INTLEVEL2_VECTOR_VADDR +#define INTLEVEL3_VECTOR_VADDR XCHAL_INTLEVEL3_VECTOR_VADDR +#define INTLEVEL4_VECTOR_VADDR XCHAL_INTLEVEL4_VECTOR_VADDR +#define INTLEVEL5_VECTOR_VADDR XCHAL_INTLEVEL5_VECTOR_VADDR +#define INTLEVEL6_VECTOR_VADDR XCHAL_INTLEVEL6_VECTOR_VADDR +#define DEBUG_VECTOR_VADDR XCHAL_DEBUG_VECTOR_VADDR + +#endif + +#endif /* _XTENSA_VECTORS_H */ diff --git a/arch/xtensa/kernel/Makefile b/arch/xtensa/kernel/Makefile index c3a59d992ac0..1e7fc87a94bb 100644 --- a/arch/xtensa/kernel/Makefile +++ b/arch/xtensa/kernel/Makefile @@ -4,14 +4,16 @@ extra-y := head.o vmlinux.lds -obj-y := align.o entry.o irq.o coprocessor.o process.o ptrace.o \ - setup.o signal.o syscall.o time.o traps.o vectors.o platform.o \ - pci-dma.o +obj-y := align.o coprocessor.o entry.o irq.o pci-dma.o platform.o process.o \ + ptrace.o setup.o signal.o stacktrace.o syscall.o time.o traps.o \ + vectors.o obj-$(CONFIG_KGDB) += xtensa-stub.o obj-$(CONFIG_PCI) += pci.o obj-$(CONFIG_MODULES) += xtensa_ksyms.o module.o +AFLAGS_head.o += -mtext-section-literals + # In the Xtensa architecture, assembly generates literals which must always # precede the L32R instruction with a relative offset less than 256 kB. # Therefore, the .text and .literal section must be combined in parenthesis diff --git a/arch/xtensa/kernel/entry.S b/arch/xtensa/kernel/entry.S index 63845f950792..5082507d5631 100644 --- a/arch/xtensa/kernel/entry.S +++ b/arch/xtensa/kernel/entry.S @@ -354,16 +354,16 @@ common_exception: * so we can allow exceptions and interrupts (*) again. * Set PS(EXCM = 0, UM = 0, RING = 0, OWB = 0, WOE = 1, INTLEVEL = X) * - * (*) We only allow interrupts of higher priority than current IRQ + * (*) We only allow interrupts if they were previously enabled and + * we're not handling an IRQ */ rsr a3, ps - addi a0, a0, -4 - movi a2, 1 + addi a0, a0, -EXCCAUSE_LEVEL1_INTERRUPT + movi a2, LOCKLEVEL extui a3, a3, PS_INTLEVEL_SHIFT, PS_INTLEVEL_WIDTH # a3 = PS.INTLEVEL - movnez a2, a3, a3 # a2 = 1: level-1, > 1: high priority - moveqz a3, a2, a0 # a3 = IRQ level iff interrupt + moveqz a3, a2, a0 # a3 = LOCKLEVEL iff interrupt movi a2, 1 << PS_WOE_BIT or a3, a3, a2 rsr a0, exccause @@ -389,6 +389,22 @@ common_exception: save_xtregs_opt a1 a2 a4 a5 a6 a7 PT_XTREGS_OPT +#ifdef CONFIG_TRACE_IRQFLAGS + l32i a4, a1, PT_DEPC + /* Double exception means we came here with an exception + * while PS.EXCM was set, i.e. interrupts disabled. + */ + bgeui a4, VALID_DOUBLE_EXCEPTION_ADDRESS, 1f + l32i a4, a1, PT_EXCCAUSE + bnei a4, EXCCAUSE_LEVEL1_INTERRUPT, 1f + /* We came here with an interrupt means interrupts were enabled + * and we've just disabled them. + */ + movi a4, trace_hardirqs_off + callx4 a4 +1: +#endif + /* Go to second-level dispatcher. Set up parameters to pass to the * exception handler and call the exception handler. */ @@ -407,11 +423,29 @@ common_exception: .global common_exception_return common_exception_return: +#ifdef CONFIG_TRACE_IRQFLAGS + l32i a4, a1, PT_DEPC + /* Double exception means we came here with an exception + * while PS.EXCM was set, i.e. interrupts disabled. + */ + bgeui a4, VALID_DOUBLE_EXCEPTION_ADDRESS, 1f + l32i a4, a1, PT_EXCCAUSE + bnei a4, EXCCAUSE_LEVEL1_INTERRUPT, 1f + /* We came here with an interrupt means interrupts were enabled + * and we'll reenable them on return. + */ + movi a4, trace_hardirqs_on + callx4 a4 +1: +#endif + /* Jump if we are returning from kernel exceptions. */ 1: l32i a3, a1, PT_PS _bbci.l a3, PS_UM_BIT, 4f + rsil a2, 0 + /* Specific to a user exception exit: * We need to check some flags for signal handling and rescheduling, * and have to restore WB and WS, extra states, and all registers @@ -652,51 +686,19 @@ common_exception_exit: l32i a0, a1, PT_DEPC l32i a3, a1, PT_AREG3 - _bltui a0, VALID_DOUBLE_EXCEPTION_ADDRESS, 1f - - wsr a0, depc l32i a2, a1, PT_AREG2 - l32i a0, a1, PT_AREG0 - l32i a1, a1, PT_AREG1 - rfde + _bgeui a0, VALID_DOUBLE_EXCEPTION_ADDRESS, 1f -1: /* Restore a0...a3 and return */ - rsr a0, ps - extui a2, a0, PS_INTLEVEL_SHIFT, PS_INTLEVEL_WIDTH - movi a0, 2f - slli a2, a2, 4 - add a0, a2, a0 - l32i a2, a1, PT_AREG2 - jx a0 - - .macro irq_exit_level level - .align 16 - .if XCHAL_EXCM_LEVEL >= \level - l32i a0, a1, PT_PC - wsr a0, epc\level l32i a0, a1, PT_AREG0 l32i a1, a1, PT_AREG1 - rfi \level - .endif - .endm + rfe - .align 16 -2: +1: wsr a0, depc l32i a0, a1, PT_AREG0 l32i a1, a1, PT_AREG1 - rfe - - .align 16 - /* no rfi for level-1 irq, handled by rfe above*/ - nop - - irq_exit_level 2 - irq_exit_level 3 - irq_exit_level 4 - irq_exit_level 5 - irq_exit_level 6 + rfde ENDPROC(kernel_exception) diff --git a/arch/xtensa/kernel/head.S b/arch/xtensa/kernel/head.S index df88f98737f4..ef12c0e6fa25 100644 --- a/arch/xtensa/kernel/head.S +++ b/arch/xtensa/kernel/head.S @@ -48,17 +48,36 @@ */ __HEAD + .begin no-absolute-literals + ENTRY(_start) - _j 2f + /* Preserve the pointer to the boot parameter list in EXCSAVE_1 */ + wsr a2, excsave1 + _j _SetupMMU + + .align 4 + .literal_position +.Lstartup: + .word _startup + .align 4 -1: .word _startup -2: l32r a0, 1b + .global _SetupMMU +_SetupMMU: + Offset = _SetupMMU - _start + +#ifdef CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX + initialize_mmu +#endif + .end no-absolute-literals + + l32r a0, .Lstartup jx a0 ENDPROC(_start) - .section .init.text, "ax" + __INIT + .literal_position ENTRY(_startup) @@ -67,10 +86,6 @@ ENTRY(_startup) movi a0, LOCKLEVEL wsr a0, ps - /* Preserve the pointer to the boot parameter list in EXCSAVE_1 */ - - wsr a2, excsave1 - /* Start with a fresh windowbase and windowstart. */ movi a1, 1 @@ -86,7 +101,9 @@ ENTRY(_startup) /* Clear debugging registers. */ #if XCHAL_HAVE_DEBUG +#if XCHAL_NUM_IBREAK > 0 wsr a0, ibreakenable +#endif wsr a0, icount movi a1, 15 wsr a0, icountlevel @@ -156,8 +173,6 @@ ENTRY(_startup) isync - initialize_mmu - /* Unpack data sections * * The linker script used to build the Linux kernel image @@ -205,6 +220,10 @@ ENTRY(_startup) ___flush_dcache_all a2 a3 #endif + memw + isync + ___invalidate_icache_all a2 a3 + isync /* Setup stack and enable window exceptions (keep irqs disabled) */ diff --git a/arch/xtensa/kernel/platform.c b/arch/xtensa/kernel/platform.c index 44bf21c3769a..2bd6c351f37c 100644 --- a/arch/xtensa/kernel/platform.c +++ b/arch/xtensa/kernel/platform.c @@ -36,6 +36,7 @@ _F(void, power_off, (void), { while(1); }); _F(void, idle, (void), { __asm__ __volatile__ ("waiti 0" ::: "memory"); }); _F(void, heartbeat, (void), { }); _F(int, pcibios_fixup, (void), { return 0; }); +_F(void, pcibios_init, (void), { }); #ifdef CONFIG_XTENSA_CALIBRATE_CCOUNT _F(void, calibrate_ccount, (void), diff --git a/arch/xtensa/kernel/stacktrace.c b/arch/xtensa/kernel/stacktrace.c new file mode 100644 index 000000000000..7d2c317bd98b --- /dev/null +++ b/arch/xtensa/kernel/stacktrace.c @@ -0,0 +1,120 @@ +/* + * arch/xtensa/kernel/stacktrace.c + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2001 - 2013 Tensilica Inc. + */ +#include +#include +#include + +#include +#include + +void walk_stackframe(unsigned long *sp, + int (*fn)(struct stackframe *frame, void *data), + void *data) +{ + unsigned long a0, a1; + unsigned long sp_end; + + a1 = (unsigned long)sp; + sp_end = ALIGN(a1, THREAD_SIZE); + + spill_registers(); + + while (a1 < sp_end) { + struct stackframe frame; + + sp = (unsigned long *)a1; + + a0 = *(sp - 4); + a1 = *(sp - 3); + + if (a1 <= (unsigned long)sp) + break; + + frame.pc = MAKE_PC_FROM_RA(a0, a1); + frame.sp = a1; + + if (fn(&frame, data)) + return; + } +} + +#ifdef CONFIG_STACKTRACE + +struct stack_trace_data { + struct stack_trace *trace; + unsigned skip; +}; + +static int stack_trace_cb(struct stackframe *frame, void *data) +{ + struct stack_trace_data *trace_data = data; + struct stack_trace *trace = trace_data->trace; + + if (trace_data->skip) { + --trace_data->skip; + return 0; + } + if (!kernel_text_address(frame->pc)) + return 0; + + trace->entries[trace->nr_entries++] = frame->pc; + return trace->nr_entries >= trace->max_entries; +} + +void save_stack_trace_tsk(struct task_struct *task, struct stack_trace *trace) +{ + struct stack_trace_data trace_data = { + .trace = trace, + .skip = trace->skip, + }; + walk_stackframe(stack_pointer(task), stack_trace_cb, &trace_data); +} +EXPORT_SYMBOL_GPL(save_stack_trace_tsk); + +void save_stack_trace(struct stack_trace *trace) +{ + save_stack_trace_tsk(current, trace); +} +EXPORT_SYMBOL_GPL(save_stack_trace); + +#endif + +#ifdef CONFIG_FRAME_POINTER + +struct return_addr_data { + unsigned long addr; + unsigned skip; +}; + +static int return_address_cb(struct stackframe *frame, void *data) +{ + struct return_addr_data *r = data; + + if (r->skip) { + --r->skip; + return 0; + } + if (!kernel_text_address(frame->pc)) + return 0; + r->addr = frame->pc; + return 1; +} + +unsigned long return_address(unsigned level) +{ + struct return_addr_data r = { + .skip = level + 1, + }; + walk_stackframe(stack_pointer(NULL), return_address_cb, &r); + return r.addr; +} +EXPORT_SYMBOL(return_address); + +#endif diff --git a/arch/xtensa/kernel/traps.c b/arch/xtensa/kernel/traps.c index 458186dab5dc..3e8a05c874cd 100644 --- a/arch/xtensa/kernel/traps.c +++ b/arch/xtensa/kernel/traps.c @@ -11,7 +11,7 @@ * * Essentially rewritten for the Xtensa architecture port. * - * Copyright (C) 2001 - 2005 Tensilica Inc. + * Copyright (C) 2001 - 2013 Tensilica Inc. * * Joe Taylor * Chris Zankel @@ -32,6 +32,7 @@ #include #include +#include #include #include #include @@ -195,7 +196,6 @@ void do_multihit(struct pt_regs *regs, unsigned long exccause) /* * IRQ handler. - * PS.INTLEVEL is the current IRQ priority level. */ extern void do_IRQ(int, struct pt_regs *); @@ -212,18 +212,21 @@ void do_interrupt(struct pt_regs *regs) XCHAL_INTLEVEL6_MASK, XCHAL_INTLEVEL7_MASK, }; - unsigned level = get_sr(ps) & PS_INTLEVEL_MASK; - - if (WARN_ON_ONCE(level >= ARRAY_SIZE(int_level_mask))) - return; for (;;) { unsigned intread = get_sr(interrupt); unsigned intenable = get_sr(intenable); - unsigned int_at_level = intread & intenable & - int_level_mask[level]; + unsigned int_at_level = intread & intenable; + unsigned level; + + for (level = LOCKLEVEL; level > 0; --level) { + if (int_at_level & int_level_mask[level]) { + int_at_level &= int_level_mask[level]; + break; + } + } - if (!int_at_level) + if (level == 0) return; /* @@ -404,53 +407,25 @@ void show_regs(struct pt_regs * regs) regs->syscall); } -static __always_inline unsigned long *stack_pointer(struct task_struct *task) +static int show_trace_cb(struct stackframe *frame, void *data) { - unsigned long *sp; - - if (!task || task == current) - __asm__ __volatile__ ("mov %0, a1\n" : "=a"(sp)); - else - sp = (unsigned long *)task->thread.sp; - - return sp; + if (kernel_text_address(frame->pc)) { + printk(" [<%08lx>] ", frame->pc); + print_symbol("%s\n", frame->pc); + } + return 0; } void show_trace(struct task_struct *task, unsigned long *sp) { - unsigned long a0, a1, pc; - unsigned long sp_start, sp_end; - - if (sp) - a1 = (unsigned long)sp; - else - a1 = (unsigned long)stack_pointer(task); - - sp_start = a1 & ~(THREAD_SIZE-1); - sp_end = sp_start + THREAD_SIZE; + if (!sp) + sp = stack_pointer(task); printk("Call Trace:"); #ifdef CONFIG_KALLSYMS printk("\n"); #endif - spill_registers(); - - while (a1 > sp_start && a1 < sp_end) { - sp = (unsigned long*)a1; - - a0 = *(sp - 4); - a1 = *(sp - 3); - - if (a1 <= (unsigned long) sp) - break; - - pc = MAKE_PC_FROM_RA(a0, a1); - - if (kernel_text_address(pc)) { - printk(" [<%08lx>] ", pc); - print_symbol("%s\n", pc); - } - } + walk_stackframe(sp, show_trace_cb, NULL); printk("\n"); } diff --git a/arch/xtensa/kernel/vectors.S b/arch/xtensa/kernel/vectors.S index 82109b42e240..f9e175382aa9 100644 --- a/arch/xtensa/kernel/vectors.S +++ b/arch/xtensa/kernel/vectors.S @@ -50,6 +50,7 @@ #include #include #include +#include #define WINDOW_VECTORS_SIZE 0x180 @@ -220,7 +221,7 @@ ENTRY(_DoubleExceptionVector) xsr a0, depc # get DEPC, save a0 - movi a3, XCHAL_WINDOW_VECTORS_VADDR + movi a3, WINDOW_VECTORS_VADDR _bltu a0, a3, .Lfixup addi a3, a3, WINDOW_VECTORS_SIZE _bgeu a0, a3, .Lfixup @@ -385,9 +386,12 @@ ENDPROC(_DebugInterruptVector) .if XCHAL_EXCM_LEVEL >= \level .section .Level\level\()InterruptVector.text, "ax" ENTRY(_Level\level\()InterruptVector) - wsr a0, epc1 + wsr a0, excsave2 rsr a0, epc\level - xsr a0, epc1 + wsr a0, epc1 + movi a0, EXCCAUSE_LEVEL1_INTERRUPT + wsr a0, exccause + rsr a0, eps\level # branch to user or kernel vector j _SimulateUserKernelVectorException .endif @@ -439,10 +443,8 @@ ENDPROC(_WindowOverflow4) */ .align 4 _SimulateUserKernelVectorException: - wsr a0, excsave2 - movi a0, 4 # LEVEL1_INTERRUPT cause - wsr a0, exccause - rsr a0, ps + addi a0, a0, (1 << PS_EXCM_BIT) + wsr a0, ps bbsi.l a0, PS_UM_BIT, 1f # branch if user mode rsr a0, excsave2 # restore a0 j _KernelExceptionVector # simulate kernel vector exception diff --git a/arch/xtensa/kernel/vmlinux.lds.S b/arch/xtensa/kernel/vmlinux.lds.S index 14695240536d..21acd11b5df2 100644 --- a/arch/xtensa/kernel/vmlinux.lds.S +++ b/arch/xtensa/kernel/vmlinux.lds.S @@ -18,6 +18,7 @@ #include #include +#include #include #include OUTPUT_ARCH(xtensa) @@ -30,7 +31,7 @@ jiffies = jiffies_64; #endif #ifndef KERNELOFFSET -#define KERNELOFFSET 0xd0001000 +#define KERNELOFFSET 0xd0003000 #endif /* Note: In the following macros, it would be nice to specify only the @@ -185,16 +186,16 @@ SECTIONS SECTION_VECTOR (_WindowVectors_text, .WindowVectors.text, - XCHAL_WINDOW_VECTORS_VADDR, 4, + WINDOW_VECTORS_VADDR, 4, .dummy) SECTION_VECTOR (_DebugInterruptVector_literal, .DebugInterruptVector.literal, - XCHAL_DEBUG_VECTOR_VADDR - 4, + DEBUG_VECTOR_VADDR - 4, SIZEOF(.WindowVectors.text), .WindowVectors.text) SECTION_VECTOR (_DebugInterruptVector_text, .DebugInterruptVector.text, - XCHAL_DEBUG_VECTOR_VADDR, + DEBUG_VECTOR_VADDR, 4, .DebugInterruptVector.literal) #undef LAST @@ -202,7 +203,7 @@ SECTIONS #if XCHAL_EXCM_LEVEL >= 2 SECTION_VECTOR (_Level2InterruptVector_text, .Level2InterruptVector.text, - XCHAL_INTLEVEL2_VECTOR_VADDR, + INTLEVEL2_VECTOR_VADDR, SIZEOF(LAST), LAST) # undef LAST # define LAST .Level2InterruptVector.text @@ -210,7 +211,7 @@ SECTIONS #if XCHAL_EXCM_LEVEL >= 3 SECTION_VECTOR (_Level3InterruptVector_text, .Level3InterruptVector.text, - XCHAL_INTLEVEL3_VECTOR_VADDR, + INTLEVEL3_VECTOR_VADDR, SIZEOF(LAST), LAST) # undef LAST # define LAST .Level3InterruptVector.text @@ -218,7 +219,7 @@ SECTIONS #if XCHAL_EXCM_LEVEL >= 4 SECTION_VECTOR (_Level4InterruptVector_text, .Level4InterruptVector.text, - XCHAL_INTLEVEL4_VECTOR_VADDR, + INTLEVEL4_VECTOR_VADDR, SIZEOF(LAST), LAST) # undef LAST # define LAST .Level4InterruptVector.text @@ -226,7 +227,7 @@ SECTIONS #if XCHAL_EXCM_LEVEL >= 5 SECTION_VECTOR (_Level5InterruptVector_text, .Level5InterruptVector.text, - XCHAL_INTLEVEL5_VECTOR_VADDR, + INTLEVEL5_VECTOR_VADDR, SIZEOF(LAST), LAST) # undef LAST # define LAST .Level5InterruptVector.text @@ -234,39 +235,39 @@ SECTIONS #if XCHAL_EXCM_LEVEL >= 6 SECTION_VECTOR (_Level6InterruptVector_text, .Level6InterruptVector.text, - XCHAL_INTLEVEL6_VECTOR_VADDR, + INTLEVEL6_VECTOR_VADDR, SIZEOF(LAST), LAST) # undef LAST # define LAST .Level6InterruptVector.text #endif SECTION_VECTOR (_KernelExceptionVector_literal, .KernelExceptionVector.literal, - XCHAL_KERNEL_VECTOR_VADDR - 4, + KERNEL_VECTOR_VADDR - 4, SIZEOF(LAST), LAST) #undef LAST SECTION_VECTOR (_KernelExceptionVector_text, .KernelExceptionVector.text, - XCHAL_KERNEL_VECTOR_VADDR, + KERNEL_VECTOR_VADDR, 4, .KernelExceptionVector.literal) SECTION_VECTOR (_UserExceptionVector_literal, .UserExceptionVector.literal, - XCHAL_USER_VECTOR_VADDR - 4, + USER_VECTOR_VADDR - 4, SIZEOF(.KernelExceptionVector.text), .KernelExceptionVector.text) SECTION_VECTOR (_UserExceptionVector_text, .UserExceptionVector.text, - XCHAL_USER_VECTOR_VADDR, + USER_VECTOR_VADDR, 4, .UserExceptionVector.literal) SECTION_VECTOR (_DoubleExceptionVector_literal, .DoubleExceptionVector.literal, - XCHAL_DOUBLEEXC_VECTOR_VADDR - 16, + DOUBLEEXC_VECTOR_VADDR - 16, SIZEOF(.UserExceptionVector.text), .UserExceptionVector.text) SECTION_VECTOR (_DoubleExceptionVector_text, .DoubleExceptionVector.text, - XCHAL_DOUBLEEXC_VECTOR_VADDR, + DOUBLEEXC_VECTOR_VADDR, 32, .DoubleExceptionVector.literal) @@ -284,11 +285,26 @@ SECTIONS . = ALIGN(0x10); .bootstrap : { *(.bootstrap.literal .bootstrap.text .bootstrap.data) } - .ResetVector.text XCHAL_RESET_VECTOR_VADDR : + .ResetVector.text RESET_VECTOR_VADDR : { *(.ResetVector.text) } + + /* + * This is a remapped copy of the Secondary Reset Vector Code. + * It keeps gdb in sync with the PC after switching + * to the temporary mapping used while setting up + * the V2 MMU mappings for Linux. + * + * Only debug information about this section is put in the kernel image. + */ + .SecondaryResetVector.remapped_text 0x46000000 (INFO): + { + *(.SecondaryResetVector.remapped_text) + } + + .xt.lit : { *(.xt.lit) } .xt.prop : { *(.xt.prop) } diff --git a/arch/xtensa/kernel/xtensa_ksyms.c b/arch/xtensa/kernel/xtensa_ksyms.c index afe058b24e6e..42c53c87c204 100644 --- a/arch/xtensa/kernel/xtensa_ksyms.c +++ b/arch/xtensa/kernel/xtensa_ksyms.c @@ -119,3 +119,8 @@ EXPORT_SYMBOL(outsl); EXPORT_SYMBOL(insb); EXPORT_SYMBOL(insw); EXPORT_SYMBOL(insl); + +extern long common_exception_return; +extern long _spill_registers; +EXPORT_SYMBOL(common_exception_return); +EXPORT_SYMBOL(_spill_registers); diff --git a/arch/xtensa/mm/mmu.c b/arch/xtensa/mm/mmu.c index 0f77f9d3bb8b..a1077570e383 100644 --- a/arch/xtensa/mm/mmu.c +++ b/arch/xtensa/mm/mmu.c @@ -24,15 +24,19 @@ void __init paging_init(void) */ void __init init_mmu(void) { - /* Writing zeros to the TLBCFG special registers ensure - * that valid values exist in the register. For existing - * PGSZID fields, zero selects the first element of the - * page-size array. For nonexistent PGSZID fields, zero is - * the best value to write. Also, when changing PGSZID +#if !(XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY) + /* + * Writing zeros to the instruction and data TLBCFG special + * registers ensure that valid values exist in the register. + * + * For existing PGSZID fields, zero selects the first element + * of the page-size array. For nonexistent PGSZID fields, + * zero is the best value to write. Also, when changing PGSZID * fields, the corresponding TLB must be flushed. */ set_itlbcfg_register(0); set_dtlbcfg_register(0); +#endif flush_tlb_all(); /* Set rasid register to a known value. */ diff --git a/arch/xtensa/oprofile/backtrace.c b/arch/xtensa/oprofile/backtrace.c index 66f32ee2c982..5f03a593d84f 100644 --- a/arch/xtensa/oprofile/backtrace.c +++ b/arch/xtensa/oprofile/backtrace.c @@ -132,9 +132,7 @@ static void xtensa_backtrace_kernel(struct pt_regs *regs, unsigned int depth) pc = MAKE_PC_FROM_RA(a0, pc); /* Add the PC to the trace. */ - if (kernel_text_address(pc)) - oprofile_add_trace(pc); - + oprofile_add_trace(pc); if (pc == (unsigned long) &common_exception_return) { regs = (struct pt_regs *)a1; if (user_mode(regs)) { diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c index da9866f7fecf..70cb408bc20d 100644 --- a/arch/xtensa/platforms/iss/console.c +++ b/arch/xtensa/platforms/iss/console.c @@ -56,13 +56,13 @@ static void rs_poll(unsigned long); static int rs_open(struct tty_struct *tty, struct file * filp) { tty->port = &serial_port; - spin_lock(&timer_lock); + spin_lock_bh(&timer_lock); if (tty->count == 1) { setup_timer(&serial_timer, rs_poll, (unsigned long)&serial_port); mod_timer(&serial_timer, jiffies + SERIAL_TIMER_VALUE); } - spin_unlock(&timer_lock); + spin_unlock_bh(&timer_lock); return 0; } @@ -99,14 +99,13 @@ static int rs_write(struct tty_struct * tty, static void rs_poll(unsigned long priv) { struct tty_port *port = (struct tty_port *)priv; - struct timeval tv = { .tv_sec = 0, .tv_usec = 0 }; int i = 0; unsigned char c; spin_lock(&timer_lock); - while (__simc(SYS_select_one, 0, XTISS_SELECT_ONE_READ, (int)&tv,0,0)){ - __simc (SYS_read, 0, (unsigned long)&c, 1, 0, 0); + while (simc_poll(0)) { + simc_read(0, &c, 1); tty_insert_flip_char(port, c, TTY_NORMAL); i++; } @@ -244,8 +243,7 @@ static void iss_console_write(struct console *co, const char *s, unsigned count) int len = strlen(s); if (s != 0 && *s != 0) - __simc (SYS_write, 1, (unsigned long)s, - count < len ? count : len,0,0); + simc_write(1, s, count < len ? count : len); } static struct tty_driver* iss_console_device(struct console *c, int *index) diff --git a/arch/xtensa/platforms/iss/include/platform/simcall.h b/arch/xtensa/platforms/iss/include/platform/simcall.h index b5a4edf02d76..12b15ad1e586 100644 --- a/arch/xtensa/platforms/iss/include/platform/simcall.h +++ b/arch/xtensa/platforms/iss/include/platform/simcall.h @@ -59,56 +59,58 @@ static int errno; -static inline int __simc(int a, int b, int c, int d, int e, int f) +static inline int __simc(int a, int b, int c, int d) { int ret; register int a1 asm("a2") = a; register int b1 asm("a3") = b; register int c1 asm("a4") = c; register int d1 asm("a5") = d; - register int e1 asm("a6") = e; - register int f1 asm("a7") = f; __asm__ __volatile__ ( "simcall\n" "mov %0, a2\n" "mov %1, a3\n" : "=a" (ret), "=a" (errno), "+r"(a1), "+r"(b1) - : "r"(c1), "r"(d1), "r"(e1), "r"(f1) + : "r"(c1), "r"(d1) : "memory"); return ret; } static inline int simc_open(const char *file, int flags, int mode) { - return __simc(SYS_open, (int) file, flags, mode, 0, 0); + return __simc(SYS_open, (int) file, flags, mode); } static inline int simc_close(int fd) { - return __simc(SYS_close, fd, 0, 0, 0, 0); + return __simc(SYS_close, fd, 0, 0); } static inline int simc_ioctl(int fd, int request, void *arg) { - return __simc(SYS_ioctl, fd, request, (int) arg, 0, 0); + return __simc(SYS_ioctl, fd, request, (int) arg); } static inline int simc_read(int fd, void *buf, size_t count) { - return __simc(SYS_read, fd, (int) buf, count, 0, 0); + return __simc(SYS_read, fd, (int) buf, count); } static inline int simc_write(int fd, const void *buf, size_t count) { - return __simc(SYS_write, fd, (int) buf, count, 0, 0); + return __simc(SYS_write, fd, (int) buf, count); } static inline int simc_poll(int fd) { struct timeval tv = { .tv_sec = 0, .tv_usec = 0 }; - return __simc(SYS_select_one, fd, XTISS_SELECT_ONE_READ, (int)&tv, - 0, 0); + return __simc(SYS_select_one, fd, XTISS_SELECT_ONE_READ, (int)&tv); +} + +static inline int simc_lseek(int fd, uint32_t off, int whence) +{ + return __simc(SYS_lseek, fd, off, whence); } #endif /* _XTENSA_PLATFORM_ISS_SIMCALL_H */ diff --git a/arch/xtensa/platforms/iss/setup.c b/arch/xtensa/platforms/iss/setup.c index e1700102f35e..da7d18240866 100644 --- a/arch/xtensa/platforms/iss/setup.c +++ b/arch/xtensa/platforms/iss/setup.c @@ -38,12 +38,6 @@ void __init platform_init(bp_tag_t* bootparam) } -#ifdef CONFIG_PCI -void platform_pcibios_init(void) -{ -} -#endif - void platform_halt(void) { pr_info(" ** Called platform_halt() **\n"); @@ -64,7 +58,9 @@ void platform_restart(void) "wsr a2, icountlevel\n\t" "movi a2, 0\n\t" "wsr a2, icount\n\t" +#if XCHAL_NUM_IBREAK > 0 "wsr a2, ibreakenable\n\t" +#endif "wsr a2, lcount\n\t" "movi a2, 0x1f\n\t" "wsr a2, ps\n\t" diff --git a/arch/xtensa/platforms/iss/simdisk.c b/arch/xtensa/platforms/iss/simdisk.c index 0345f43d34f3..c0edb35424ce 100644 --- a/arch/xtensa/platforms/iss/simdisk.c +++ b/arch/xtensa/platforms/iss/simdisk.c @@ -85,7 +85,7 @@ static void simdisk_transfer(struct simdisk *dev, unsigned long sector, while (nbytes > 0) { unsigned long io; - __simc(SYS_lseek, dev->fd, offset, SEEK_SET, 0, 0); + simc_lseek(dev->fd, offset, SEEK_SET); if (write) io = simc_write(dev->fd, buffer, nbytes); else @@ -176,7 +176,7 @@ static int simdisk_attach(struct simdisk *dev, const char *filename) err = -ENODEV; goto out; } - dev->size = __simc(SYS_lseek, dev->fd, 0, SEEK_END, 0, 0); + dev->size = simc_lseek(dev->fd, 0, SEEK_END); set_capacity(dev->gd, dev->size >> SECTOR_SHIFT); dev->filename = filename; pr_info("SIMDISK: %s=%s\n", dev->gd->disk_name, dev->filename); @@ -217,7 +217,7 @@ static ssize_t proc_read_simdisk(struct file *file, char __user *buf, size_t size, loff_t *ppos) { struct simdisk *dev = PDE_DATA(file_inode(file)); - char *s = dev->filename; + const char *s = dev->filename; if (s) { ssize_t n = simple_read_from_buffer(buf, size, ppos, s, strlen(s)); @@ -238,7 +238,7 @@ static ssize_t proc_write_simdisk(struct file *file, const char __user *buf, if (tmp == NULL) return -ENOMEM; - if (copy_from_user(tmp, buffer, count)) { + if (copy_from_user(tmp, buf, count)) { err = -EFAULT; goto out_free; } diff --git a/arch/xtensa/platforms/xt2000/setup.c b/arch/xtensa/platforms/xt2000/setup.c index c7d90f17886e..f9bc87966290 100644 --- a/arch/xtensa/platforms/xt2000/setup.c +++ b/arch/xtensa/platforms/xt2000/setup.c @@ -69,7 +69,9 @@ void platform_restart(void) "wsr a2, icountlevel\n\t" "movi a2, 0\n\t" "wsr a2, icount\n\t" +#if XCHAL_NUM_IBREAK > 0 "wsr a2, ibreakenable\n\t" +#endif "wsr a2, lcount\n\t" "movi a2, 0x1f\n\t" "wsr a2, ps\n\t" diff --git a/arch/xtensa/platforms/xtfpga/setup.c b/arch/xtensa/platforms/xtfpga/setup.c index 9d888a2a5755..96ef8eeb064e 100644 --- a/arch/xtensa/platforms/xtfpga/setup.c +++ b/arch/xtensa/platforms/xtfpga/setup.c @@ -60,7 +60,9 @@ void platform_restart(void) "wsr a2, icountlevel\n\t" "movi a2, 0\n\t" "wsr a2, icount\n\t" +#if XCHAL_NUM_IBREAK > 0 "wsr a2, ibreakenable\n\t" +#endif "wsr a2, lcount\n\t" "movi a2, 0x1f\n\t" "wsr a2, ps\n\t" diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index b2b9837f9dd3..e8918ffaf96d 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -972,10 +972,10 @@ int blkcg_activate_policy(struct request_queue *q, if (!new_blkg) return -ENOMEM; - preloaded = !radix_tree_preload(GFP_KERNEL); - blk_queue_bypass_start(q); + preloaded = !radix_tree_preload(GFP_KERNEL); + /* * Make sure the root blkg exists and count the existing blkgs. As * @q is bypassing at this point, blkg_lookup_create() can't be diff --git a/block/blk-core.c b/block/blk-core.c index 7c288358a745..33c33bc99ddd 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -30,6 +30,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -159,20 +160,10 @@ static void req_bio_endio(struct request *rq, struct bio *bio, else if (!test_bit(BIO_UPTODATE, &bio->bi_flags)) error = -EIO; - if (unlikely(nbytes > bio->bi_size)) { - printk(KERN_ERR "%s: want %u bytes done, %u left\n", - __func__, nbytes, bio->bi_size); - nbytes = bio->bi_size; - } - if (unlikely(rq->cmd_flags & REQ_QUIET)) set_bit(BIO_QUIET, &bio->bi_flags); - bio->bi_size -= nbytes; - bio->bi_sector += (nbytes >> 9); - - if (bio_integrity(bio)) - bio_integrity_advance(bio, nbytes); + bio_advance(bio, nbytes); /* don't actually finish bio if it's part of flush sequence */ if (bio->bi_size == 0 && !(rq->cmd_flags & REQ_FLUSH_SEQ)) @@ -1264,6 +1255,16 @@ void part_round_stats(int cpu, struct hd_struct *part) } EXPORT_SYMBOL_GPL(part_round_stats); +#ifdef CONFIG_PM_RUNTIME +static void blk_pm_put_request(struct request *rq) +{ + if (rq->q->dev && !(rq->cmd_flags & REQ_PM) && !--rq->q->nr_pending) + pm_runtime_mark_last_busy(rq->q->dev); +} +#else +static inline void blk_pm_put_request(struct request *rq) {} +#endif + /* * queue lock must be held */ @@ -1274,6 +1275,8 @@ void __blk_put_request(struct request_queue *q, struct request *req) if (unlikely(--req->ref_count)) return; + blk_pm_put_request(req); + elv_completed_request(q, req); /* this is a bio leak */ @@ -1597,7 +1600,7 @@ static void handle_bad_sector(struct bio *bio) printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n", bdevname(bio->bi_bdev, b), bio->bi_rw, - (unsigned long long)bio->bi_sector + bio_sectors(bio), + (unsigned long long)bio_end_sector(bio), (long long)(i_size_read(bio->bi_bdev->bd_inode) >> 9)); set_bit(BIO_EOF, &bio->bi_flags); @@ -2053,6 +2056,28 @@ static void blk_account_io_done(struct request *req) } } +#ifdef CONFIG_PM_RUNTIME +/* + * Don't process normal requests when queue is suspended + * or in the process of suspending/resuming + */ +static struct request *blk_pm_peek_request(struct request_queue *q, + struct request *rq) +{ + if (q->dev && (q->rpm_status == RPM_SUSPENDED || + (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM)))) + return NULL; + else + return rq; +} +#else +static inline struct request *blk_pm_peek_request(struct request_queue *q, + struct request *rq) +{ + return rq; +} +#endif + /** * blk_peek_request - peek at the top of a request queue * @q: request queue to peek at @@ -2075,6 +2100,11 @@ struct request *blk_peek_request(struct request_queue *q) int ret; while ((rq = __elv_next_request(q)) != NULL) { + + rq = blk_pm_peek_request(q, rq); + if (!rq) + break; + if (!(rq->cmd_flags & REQ_STARTED)) { /* * This is the first time the device driver @@ -2253,8 +2283,7 @@ EXPORT_SYMBOL(blk_fetch_request); **/ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes) { - int total_bytes, bio_nbytes, next_idx = 0; - struct bio *bio; + int total_bytes; if (!req->bio) return false; @@ -2300,56 +2329,21 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes) blk_account_io_completion(req, nr_bytes); - total_bytes = bio_nbytes = 0; - while ((bio = req->bio) != NULL) { - int nbytes; + total_bytes = 0; + while (req->bio) { + struct bio *bio = req->bio; + unsigned bio_bytes = min(bio->bi_size, nr_bytes); - if (nr_bytes >= bio->bi_size) { + if (bio_bytes == bio->bi_size) req->bio = bio->bi_next; - nbytes = bio->bi_size; - req_bio_endio(req, bio, nbytes, error); - next_idx = 0; - bio_nbytes = 0; - } else { - int idx = bio->bi_idx + next_idx; - if (unlikely(idx >= bio->bi_vcnt)) { - blk_dump_rq_flags(req, "__end_that"); - printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n", - __func__, idx, bio->bi_vcnt); - break; - } + req_bio_endio(req, bio, bio_bytes, error); - nbytes = bio_iovec_idx(bio, idx)->bv_len; - BIO_BUG_ON(nbytes > bio->bi_size); + total_bytes += bio_bytes; + nr_bytes -= bio_bytes; - /* - * not a complete bvec done - */ - if (unlikely(nbytes > nr_bytes)) { - bio_nbytes += nr_bytes; - total_bytes += nr_bytes; - break; - } - - /* - * advance to the next vector - */ - next_idx++; - bio_nbytes += nbytes; - } - - total_bytes += nbytes; - nr_bytes -= nbytes; - - bio = req->bio; - if (bio) { - /* - * end more in this run, or just return 'not-done' - */ - if (unlikely(nr_bytes <= 0)) - break; - } + if (!nr_bytes) + break; } /* @@ -2365,16 +2359,6 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes) return false; } - /* - * if the request wasn't completed, update state - */ - if (bio_nbytes) { - req_bio_endio(req, bio, bio_nbytes, error); - bio->bi_idx += next_idx; - bio_iovec(bio)->bv_offset += nr_bytes; - bio_iovec(bio)->bv_len -= nr_bytes; - } - req->__data_len -= total_bytes; req->buffer = bio_data(req->bio); @@ -3046,6 +3030,149 @@ void blk_finish_plug(struct blk_plug *plug) } EXPORT_SYMBOL(blk_finish_plug); +#ifdef CONFIG_PM_RUNTIME +/** + * blk_pm_runtime_init - Block layer runtime PM initialization routine + * @q: the queue of the device + * @dev: the device the queue belongs to + * + * Description: + * Initialize runtime-PM-related fields for @q and start auto suspend for + * @dev. Drivers that want to take advantage of request-based runtime PM + * should call this function after @dev has been initialized, and its + * request queue @q has been allocated, and runtime PM for it can not happen + * yet(either due to disabled/forbidden or its usage_count > 0). In most + * cases, driver should call this function before any I/O has taken place. + * + * This function takes care of setting up using auto suspend for the device, + * the autosuspend delay is set to -1 to make runtime suspend impossible + * until an updated value is either set by user or by driver. Drivers do + * not need to touch other autosuspend settings. + * + * The block layer runtime PM is request based, so only works for drivers + * that use request as their IO unit instead of those directly use bio's. + */ +void blk_pm_runtime_init(struct request_queue *q, struct device *dev) +{ + q->dev = dev; + q->rpm_status = RPM_ACTIVE; + pm_runtime_set_autosuspend_delay(q->dev, -1); + pm_runtime_use_autosuspend(q->dev); +} +EXPORT_SYMBOL(blk_pm_runtime_init); + +/** + * blk_pre_runtime_suspend - Pre runtime suspend check + * @q: the queue of the device + * + * Description: + * This function will check if runtime suspend is allowed for the device + * by examining if there are any requests pending in the queue. If there + * are requests pending, the device can not be runtime suspended; otherwise, + * the queue's status will be updated to SUSPENDING and the driver can + * proceed to suspend the device. + * + * For the not allowed case, we mark last busy for the device so that + * runtime PM core will try to autosuspend it some time later. + * + * This function should be called near the start of the device's + * runtime_suspend callback. + * + * Return: + * 0 - OK to runtime suspend the device + * -EBUSY - Device should not be runtime suspended + */ +int blk_pre_runtime_suspend(struct request_queue *q) +{ + int ret = 0; + + spin_lock_irq(q->queue_lock); + if (q->nr_pending) { + ret = -EBUSY; + pm_runtime_mark_last_busy(q->dev); + } else { + q->rpm_status = RPM_SUSPENDING; + } + spin_unlock_irq(q->queue_lock); + return ret; +} +EXPORT_SYMBOL(blk_pre_runtime_suspend); + +/** + * blk_post_runtime_suspend - Post runtime suspend processing + * @q: the queue of the device + * @err: return value of the device's runtime_suspend function + * + * Description: + * Update the queue's runtime status according to the return value of the + * device's runtime suspend function and mark last busy for the device so + * that PM core will try to auto suspend the device at a later time. + * + * This function should be called near the end of the device's + * runtime_suspend callback. + */ +void blk_post_runtime_suspend(struct request_queue *q, int err) +{ + spin_lock_irq(q->queue_lock); + if (!err) { + q->rpm_status = RPM_SUSPENDED; + } else { + q->rpm_status = RPM_ACTIVE; + pm_runtime_mark_last_busy(q->dev); + } + spin_unlock_irq(q->queue_lock); +} +EXPORT_SYMBOL(blk_post_runtime_suspend); + +/** + * blk_pre_runtime_resume - Pre runtime resume processing + * @q: the queue of the device + * + * Description: + * Update the queue's runtime status to RESUMING in preparation for the + * runtime resume of the device. + * + * This function should be called near the start of the device's + * runtime_resume callback. + */ +void blk_pre_runtime_resume(struct request_queue *q) +{ + spin_lock_irq(q->queue_lock); + q->rpm_status = RPM_RESUMING; + spin_unlock_irq(q->queue_lock); +} +EXPORT_SYMBOL(blk_pre_runtime_resume); + +/** + * blk_post_runtime_resume - Post runtime resume processing + * @q: the queue of the device + * @err: return value of the device's runtime_resume function + * + * Description: + * Update the queue's runtime status according to the return value of the + * device's runtime_resume function. If it is successfully resumed, process + * the requests that are queued into the device's queue when it is resuming + * and then mark last busy and initiate autosuspend for it. + * + * This function should be called near the end of the device's + * runtime_resume callback. + */ +void blk_post_runtime_resume(struct request_queue *q, int err) +{ + spin_lock_irq(q->queue_lock); + if (!err) { + q->rpm_status = RPM_ACTIVE; + __blk_run_queue(q); + pm_runtime_mark_last_busy(q->dev); + pm_runtime_autosuspend(q->dev); + } else { + q->rpm_status = RPM_SUSPENDED; + } + spin_unlock_irq(q->queue_lock); +} +EXPORT_SYMBOL(blk_post_runtime_resume); +#endif + int __init blk_dev_init(void) { BUILD_BUG_ON(__REQ_NR_BITS > 8 * diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 4f0ade74cfd0..d5cd3131c57a 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -2270,11 +2270,8 @@ cfq_find_rq_fmerge(struct cfq_data *cfqd, struct bio *bio) return NULL; cfqq = cic_to_cfqq(cic, cfq_bio_sync(bio)); - if (cfqq) { - sector_t sector = bio->bi_sector + bio_sectors(bio); - - return elv_rb_find(&cfqq->sort_list, sector); - } + if (cfqq) + return elv_rb_find(&cfqq->sort_list, bio_end_sector(bio)); return NULL; } diff --git a/block/deadline-iosched.c b/block/deadline-iosched.c index 90037b5eb17f..ba19a3afab79 100644 --- a/block/deadline-iosched.c +++ b/block/deadline-iosched.c @@ -132,7 +132,7 @@ deadline_merge(struct request_queue *q, struct request **req, struct bio *bio) * check for front merge */ if (dd->front_merges) { - sector_t sector = bio->bi_sector + bio_sectors(bio); + sector_t sector = bio_end_sector(bio); __rq = elv_rb_find(&dd->sort_list[bio_data_dir(bio)], sector); if (__rq) { diff --git a/block/elevator.c b/block/elevator.c index a0ffdd943c98..eba5b04c29b1 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -34,6 +34,7 @@ #include #include #include +#include #include @@ -536,6 +537,27 @@ void elv_bio_merged(struct request_queue *q, struct request *rq, e->type->ops.elevator_bio_merged_fn(q, rq, bio); } +#ifdef CONFIG_PM_RUNTIME +static void blk_pm_requeue_request(struct request *rq) +{ + if (rq->q->dev && !(rq->cmd_flags & REQ_PM)) + rq->q->nr_pending--; +} + +static void blk_pm_add_request(struct request_queue *q, struct request *rq) +{ + if (q->dev && !(rq->cmd_flags & REQ_PM) && q->nr_pending++ == 0 && + (q->rpm_status == RPM_SUSPENDED || q->rpm_status == RPM_SUSPENDING)) + pm_request_resume(q->dev); +} +#else +static inline void blk_pm_requeue_request(struct request *rq) {} +static inline void blk_pm_add_request(struct request_queue *q, + struct request *rq) +{ +} +#endif + void elv_requeue_request(struct request_queue *q, struct request *rq) { /* @@ -550,6 +572,8 @@ void elv_requeue_request(struct request_queue *q, struct request *rq) rq->cmd_flags &= ~REQ_STARTED; + blk_pm_requeue_request(rq); + __elv_add_request(q, rq, ELEVATOR_INSERT_REQUEUE); } @@ -572,6 +596,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) { trace_block_rq_insert(q, rq); + blk_pm_add_request(q, rq); + rq->q = q; if (rq->cmd_flags & REQ_SOFTBARRIER) { diff --git a/block/partitions/efi.c b/block/partitions/efi.c index ff5804e2f1d2..c85fc895ecdb 100644 --- a/block/partitions/efi.c +++ b/block/partitions/efi.c @@ -238,7 +238,7 @@ static gpt_entry *alloc_read_gpt_entries(struct parsed_partitions *state, le32_to_cpu(gpt->sizeof_partition_entry); if (!count) return NULL; - pte = kzalloc(count, GFP_KERNEL); + pte = kmalloc(count, GFP_KERNEL); if (!pte) return NULL; @@ -267,7 +267,7 @@ static gpt_header *alloc_read_gpt_header(struct parsed_partitions *state, gpt_header *gpt; unsigned ssz = bdev_logical_block_size(state->bdev); - gpt = kzalloc(ssz, GFP_KERNEL); + gpt = kmalloc(ssz, GFP_KERNEL); if (!gpt) return NULL; diff --git a/drivers/acpi/acpica/exfldio.c b/drivers/acpi/acpica/exfldio.c index ec7f5690031b..c84ee956fa4c 100644 --- a/drivers/acpi/acpica/exfldio.c +++ b/drivers/acpi/acpica/exfldio.c @@ -720,7 +720,19 @@ acpi_ex_extract_from_field(union acpi_operand_object *obj_desc, if ((obj_desc->common_field.start_field_bit_offset == 0) && (obj_desc->common_field.bit_length == access_bit_width)) { - status = acpi_ex_field_datum_io(obj_desc, 0, buffer, ACPI_READ); + if (buffer_length >= sizeof(u64)) { + status = + acpi_ex_field_datum_io(obj_desc, 0, buffer, + ACPI_READ); + } else { + /* Use raw_datum (u64) to handle buffers < 64 bits */ + + status = + acpi_ex_field_datum_io(obj_desc, 0, &raw_datum, + ACPI_READ); + ACPI_MEMCPY(buffer, &raw_datum, buffer_length); + } + return_ACPI_STATUS(status); } diff --git a/drivers/acpi/acpica/nsinit.c b/drivers/acpi/acpica/nsinit.c index 2a431ec50a25..46f0f83417a1 100644 --- a/drivers/acpi/acpica/nsinit.c +++ b/drivers/acpi/acpica/nsinit.c @@ -558,6 +558,7 @@ acpi_ns_init_one_device(acpi_handle obj_handle, ACPI_DEBUG_EXEC(acpi_ut_display_init_pathname (ACPI_TYPE_METHOD, device_node, METHOD_NAME__INI)); + ACPI_MEMSET(info, 0, sizeof(struct acpi_evaluate_info)); info->prefix_node = device_node; info->pathname = METHOD_NAME__INI; info->parameters = NULL; diff --git a/drivers/acpi/acpica/utosi.c b/drivers/acpi/acpica/utosi.c index b15acebb96a1..7e807725c636 100644 --- a/drivers/acpi/acpica/utosi.c +++ b/drivers/acpi/acpica/utosi.c @@ -349,7 +349,8 @@ acpi_status acpi_ut_osi_implementation(struct acpi_walk_state * walk_state) return_value = 0; status = acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER); if (ACPI_FAILURE(status)) { - return (status); + acpi_ut_remove_reference(return_desc); + return_ACPI_STATUS(status); } /* Lookup the interface in the global _OSI list */ diff --git a/drivers/bcma/driver_mips.c b/drivers/bcma/driver_mips.c index 9a7f0e3ab5a3..11115bbe115c 100644 --- a/drivers/bcma/driver_mips.c +++ b/drivers/bcma/driver_mips.c @@ -21,7 +21,7 @@ #include #include -static const char *part_probes[] = { "bcm47xxpart", NULL }; +static const char * const part_probes[] = { "bcm47xxpart", NULL }; static struct physmap_flash_data bcma_pflash_data = { .part_probe_types = part_probes, diff --git a/drivers/block/Makefile b/drivers/block/Makefile index a3b40232c6ab..ca07399a8d99 100644 --- a/drivers/block/Makefile +++ b/drivers/block/Makefile @@ -42,4 +42,5 @@ obj-$(CONFIG_BLK_DEV_PCIESSD_MTIP32XX) += mtip32xx/ obj-$(CONFIG_BLK_DEV_RSXX) += rsxx/ +nvme-y := nvme-core.o nvme-scsi.o swim_mod-y := swim.o swim_asm.o diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 92b6d7c51e39..fc803ecbbce4 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -920,16 +920,14 @@ bio_pagedec(struct bio *bio) static void bufinit(struct buf *buf, struct request *rq, struct bio *bio) { - struct bio_vec *bv; - memset(buf, 0, sizeof(*buf)); buf->rq = rq; buf->bio = bio; buf->resid = bio->bi_size; buf->sector = bio->bi_sector; bio_pageinc(bio); - buf->bv = bv = &bio->bi_io_vec[bio->bi_idx]; - buf->bv_resid = bv->bv_len; + buf->bv = bio_iovec(bio); + buf->bv_resid = buf->bv->bv_len; WARN_ON(buf->bv_resid == 0); } diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 531ceb31d0ff..f1a29f8e9d33 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -334,8 +334,7 @@ static void brd_make_request(struct request_queue *q, struct bio *bio) int err = -EIO; sector = bio->bi_sector; - if (sector + (bio->bi_size >> SECTOR_SHIFT) > - get_capacity(bdev->bd_disk)) + if (bio_end_sector(bio) > get_capacity(bdev->bd_disk)) goto out; if (unlikely(bio->bi_rw & REQ_DISCARD)) { diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index 94b51c5e0678..6374dc103521 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -75,6 +75,12 @@ module_param(cciss_simple_mode, int, S_IRUGO|S_IWUSR); MODULE_PARM_DESC(cciss_simple_mode, "Use 'simple mode' rather than 'performant mode'"); +static int cciss_allow_hpsa; +module_param(cciss_allow_hpsa, int, S_IRUGO|S_IWUSR); +MODULE_PARM_DESC(cciss_allow_hpsa, + "Prevent cciss driver from accessing hardware known to be " + " supported by the hpsa driver"); + static DEFINE_MUTEX(cciss_mutex); static struct proc_dir_entry *proc_cciss; @@ -4115,9 +4121,13 @@ static int cciss_lookup_board_id(struct pci_dev *pdev, u32 *board_id) *board_id = ((subsystem_device_id << 16) & 0xffff0000) | subsystem_vendor_id; - for (i = 0; i < ARRAY_SIZE(products); i++) + for (i = 0; i < ARRAY_SIZE(products); i++) { + /* Stand aside for hpsa driver on request */ + if (cciss_allow_hpsa) + return -ENODEV; if (*board_id == products[i].board_id) return i; + } dev_warn(&pdev->dev, "unrecognized board ID: 0x%08x, ignoring.\n", *board_id); return -ENODEV; @@ -4959,6 +4969,16 @@ static int cciss_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) ctlr_info_t *h; unsigned long flags; + /* + * By default the cciss driver is used for all older HP Smart Array + * controllers. There are module paramaters that allow a user to + * override this behavior and instead use the hpsa SCSI driver. If + * this is the case cciss may be loaded first from the kdump initrd + * image and cause a kernel panic. So if reset_devices is true and + * cciss_allow_hpsa is set just bail. + */ + if ((reset_devices) && (cciss_allow_hpsa == 1)) + return -ENODEV; rc = cciss_init_reset_devices(pdev); if (rc) { if (rc != -ENOTSUPP) diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c index 92510f8ad013..6608076dc39e 100644 --- a/drivers/block/drbd/drbd_actlog.c +++ b/drivers/block/drbd/drbd_actlog.c @@ -104,7 +104,6 @@ struct update_al_work { int err; }; -static int al_write_transaction(struct drbd_conf *mdev); void *drbd_md_get_buffer(struct drbd_conf *mdev) { @@ -168,7 +167,11 @@ static int _drbd_md_sync_page_io(struct drbd_conf *mdev, bio->bi_end_io = drbd_md_io_complete; bio->bi_rw = rw; - if (!get_ldev_if_state(mdev, D_ATTACHING)) { /* Corresponding put_ldev in drbd_md_io_complete() */ + if (!(rw & WRITE) && mdev->state.disk == D_DISKLESS && mdev->ldev == NULL) + /* special case, drbd_md_read() during drbd_adm_attach(): no get_ldev */ + ; + else if (!get_ldev_if_state(mdev, D_ATTACHING)) { + /* Corresponding put_ldev in drbd_md_io_complete() */ dev_err(DEV, "ASSERT FAILED: get_ldev_if_state() == 1 in _drbd_md_sync_page_io()\n"); err = -ENODEV; goto out; @@ -199,9 +202,10 @@ int drbd_md_sync_page_io(struct drbd_conf *mdev, struct drbd_backing_dev *bdev, BUG_ON(!bdev->md_bdev); - dev_dbg(DEV, "meta_data io: %s [%d]:%s(,%llus,%s)\n", + dev_dbg(DEV, "meta_data io: %s [%d]:%s(,%llus,%s) %pS\n", current->comm, current->pid, __func__, - (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ"); + (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ", + (void*)_RET_IP_ ); if (sector < drbd_md_first_sector(bdev) || sector + 7 > drbd_md_last_sector(bdev)) @@ -209,7 +213,8 @@ int drbd_md_sync_page_io(struct drbd_conf *mdev, struct drbd_backing_dev *bdev, current->comm, current->pid, __func__, (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ"); - err = _drbd_md_sync_page_io(mdev, bdev, iop, sector, rw, MD_BLOCK_SIZE); + /* we do all our meta data IO in aligned 4k blocks. */ + err = _drbd_md_sync_page_io(mdev, bdev, iop, sector, rw, 4096); if (err) { dev_err(DEV, "drbd_md_sync_page_io(,%llus,%s) failed with error %d\n", (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ", err); @@ -217,44 +222,99 @@ int drbd_md_sync_page_io(struct drbd_conf *mdev, struct drbd_backing_dev *bdev, return err; } -static struct lc_element *_al_get(struct drbd_conf *mdev, unsigned int enr) +static struct bm_extent *find_active_resync_extent(struct drbd_conf *mdev, unsigned int enr) { - struct lc_element *al_ext; struct lc_element *tmp; - int wake; - - spin_lock_irq(&mdev->al_lock); tmp = lc_find(mdev->resync, enr/AL_EXT_PER_BM_SECT); if (unlikely(tmp != NULL)) { struct bm_extent *bm_ext = lc_entry(tmp, struct bm_extent, lce); - if (test_bit(BME_NO_WRITES, &bm_ext->flags)) { - wake = !test_and_set_bit(BME_PRIORITY, &bm_ext->flags); - spin_unlock_irq(&mdev->al_lock); - if (wake) - wake_up(&mdev->al_wait); - return NULL; - } + if (test_bit(BME_NO_WRITES, &bm_ext->flags)) + return bm_ext; + } + return NULL; +} + +static struct lc_element *_al_get(struct drbd_conf *mdev, unsigned int enr, bool nonblock) +{ + struct lc_element *al_ext; + struct bm_extent *bm_ext; + int wake; + + spin_lock_irq(&mdev->al_lock); + bm_ext = find_active_resync_extent(mdev, enr); + if (bm_ext) { + wake = !test_and_set_bit(BME_PRIORITY, &bm_ext->flags); + spin_unlock_irq(&mdev->al_lock); + if (wake) + wake_up(&mdev->al_wait); + return NULL; } - al_ext = lc_get(mdev->act_log, enr); + if (nonblock) + al_ext = lc_try_get(mdev->act_log, enr); + else + al_ext = lc_get(mdev->act_log, enr); spin_unlock_irq(&mdev->al_lock); return al_ext; } -void drbd_al_begin_io(struct drbd_conf *mdev, struct drbd_interval *i) +bool drbd_al_begin_io_fastpath(struct drbd_conf *mdev, struct drbd_interval *i) { /* for bios crossing activity log extent boundaries, * we may need to activate two extents in one go */ unsigned first = i->sector >> (AL_EXTENT_SHIFT-9); unsigned last = i->size == 0 ? first : (i->sector + (i->size >> 9) - 1) >> (AL_EXTENT_SHIFT-9); - unsigned enr; - bool locked = false; + D_ASSERT((unsigned)(last - first) <= 1); + D_ASSERT(atomic_read(&mdev->local_cnt) > 0); + + /* FIXME figure out a fast path for bios crossing AL extent boundaries */ + if (first != last) + return false; + + return _al_get(mdev, first, true); +} + +bool drbd_al_begin_io_prepare(struct drbd_conf *mdev, struct drbd_interval *i) +{ + /* for bios crossing activity log extent boundaries, + * we may need to activate two extents in one go */ + unsigned first = i->sector >> (AL_EXTENT_SHIFT-9); + unsigned last = i->size == 0 ? first : (i->sector + (i->size >> 9) - 1) >> (AL_EXTENT_SHIFT-9); + unsigned enr; + bool need_transaction = false; D_ASSERT(first <= last); D_ASSERT(atomic_read(&mdev->local_cnt) > 0); - for (enr = first; enr <= last; enr++) - wait_event(mdev->al_wait, _al_get(mdev, enr) != NULL); + for (enr = first; enr <= last; enr++) { + struct lc_element *al_ext; + wait_event(mdev->al_wait, + (al_ext = _al_get(mdev, enr, false)) != NULL); + if (al_ext->lc_number != enr) + need_transaction = true; + } + return need_transaction; +} + +static int al_write_transaction(struct drbd_conf *mdev, bool delegate); + +/* When called through generic_make_request(), we must delegate + * activity log I/O to the worker thread: a further request + * submitted via generic_make_request() within the same task + * would be queued on current->bio_list, and would only start + * after this function returns (see generic_make_request()). + * + * However, if we *are* the worker, we must not delegate to ourselves. + */ + +/* + * @delegate: delegate activity log I/O to the worker thread + */ +void drbd_al_begin_io_commit(struct drbd_conf *mdev, bool delegate) +{ + bool locked = false; + + BUG_ON(delegate && current == mdev->tconn->worker.task); /* Serialize multiple transactions. * This uses test_and_set_bit, memory barrier is implicit. @@ -264,13 +324,6 @@ void drbd_al_begin_io(struct drbd_conf *mdev, struct drbd_interval *i) (locked = lc_try_lock_for_transaction(mdev->act_log))); if (locked) { - /* drbd_al_write_transaction(mdev,al_ext,enr); - * recurses into generic_make_request(), which - * disallows recursion, bios being serialized on the - * current->bio_tail list now. - * we have to delegate updates to the activity log - * to the worker thread. */ - /* Double check: it may have been committed by someone else, * while we have been waiting for the lock. */ if (mdev->act_log->pending_changes) { @@ -280,11 +333,8 @@ void drbd_al_begin_io(struct drbd_conf *mdev, struct drbd_interval *i) write_al_updates = rcu_dereference(mdev->ldev->disk_conf)->al_updates; rcu_read_unlock(); - if (write_al_updates) { - al_write_transaction(mdev); - mdev->al_writ_cnt++; - } - + if (write_al_updates) + al_write_transaction(mdev, delegate); spin_lock_irq(&mdev->al_lock); /* FIXME if (err) @@ -298,6 +348,66 @@ void drbd_al_begin_io(struct drbd_conf *mdev, struct drbd_interval *i) } } +/* + * @delegate: delegate activity log I/O to the worker thread + */ +void drbd_al_begin_io(struct drbd_conf *mdev, struct drbd_interval *i, bool delegate) +{ + BUG_ON(delegate && current == mdev->tconn->worker.task); + + if (drbd_al_begin_io_prepare(mdev, i)) + drbd_al_begin_io_commit(mdev, delegate); +} + +int drbd_al_begin_io_nonblock(struct drbd_conf *mdev, struct drbd_interval *i) +{ + struct lru_cache *al = mdev->act_log; + /* for bios crossing activity log extent boundaries, + * we may need to activate two extents in one go */ + unsigned first = i->sector >> (AL_EXTENT_SHIFT-9); + unsigned last = i->size == 0 ? first : (i->sector + (i->size >> 9) - 1) >> (AL_EXTENT_SHIFT-9); + unsigned nr_al_extents; + unsigned available_update_slots; + unsigned enr; + + D_ASSERT(first <= last); + + nr_al_extents = 1 + last - first; /* worst case: all touched extends are cold. */ + available_update_slots = min(al->nr_elements - al->used, + al->max_pending_changes - al->pending_changes); + + /* We want all necessary updates for a given request within the same transaction + * We could first check how many updates are *actually* needed, + * and use that instead of the worst-case nr_al_extents */ + if (available_update_slots < nr_al_extents) + return -EWOULDBLOCK; + + /* Is resync active in this area? */ + for (enr = first; enr <= last; enr++) { + struct lc_element *tmp; + tmp = lc_find(mdev->resync, enr/AL_EXT_PER_BM_SECT); + if (unlikely(tmp != NULL)) { + struct bm_extent *bm_ext = lc_entry(tmp, struct bm_extent, lce); + if (test_bit(BME_NO_WRITES, &bm_ext->flags)) { + if (!test_and_set_bit(BME_PRIORITY, &bm_ext->flags)) + return -EBUSY; + return -EWOULDBLOCK; + } + } + } + + /* Checkout the refcounts. + * Given that we checked for available elements and update slots above, + * this has to be successful. */ + for (enr = first; enr <= last; enr++) { + struct lc_element *al_ext; + al_ext = lc_get_cumulative(mdev->act_log, enr); + if (!al_ext) + dev_info(DEV, "LOGIC BUG for enr=%u\n", enr); + } + return 0; +} + void drbd_al_complete_io(struct drbd_conf *mdev, struct drbd_interval *i) { /* for bios crossing activity log extent boundaries, @@ -350,6 +460,24 @@ static unsigned int rs_extent_to_bm_page(unsigned int rs_enr) (BM_EXT_SHIFT - BM_BLOCK_SHIFT)); } +static sector_t al_tr_number_to_on_disk_sector(struct drbd_conf *mdev) +{ + const unsigned int stripes = mdev->ldev->md.al_stripes; + const unsigned int stripe_size_4kB = mdev->ldev->md.al_stripe_size_4k; + + /* transaction number, modulo on-disk ring buffer wrap around */ + unsigned int t = mdev->al_tr_number % (mdev->ldev->md.al_size_4k); + + /* ... to aligned 4k on disk block */ + t = ((t % stripes) * stripe_size_4kB) + t/stripes; + + /* ... to 512 byte sector in activity log */ + t *= 8; + + /* ... plus offset to the on disk position */ + return mdev->ldev->md.md_offset + mdev->ldev->md.al_offset + t; +} + static int _al_write_transaction(struct drbd_conf *mdev) { @@ -432,23 +560,27 @@ _al_write_transaction(struct drbd_conf *mdev) if (mdev->al_tr_cycle >= mdev->act_log->nr_elements) mdev->al_tr_cycle = 0; - sector = mdev->ldev->md.md_offset - + mdev->ldev->md.al_offset - + mdev->al_tr_pos * (MD_BLOCK_SIZE>>9); + sector = al_tr_number_to_on_disk_sector(mdev); crc = crc32c(0, buffer, 4096); buffer->crc32c = cpu_to_be32(crc); if (drbd_bm_write_hinted(mdev)) err = -EIO; - /* drbd_chk_io_error done already */ - else if (drbd_md_sync_page_io(mdev, mdev->ldev, sector, WRITE)) { - err = -EIO; - drbd_chk_io_error(mdev, 1, DRBD_META_IO_ERROR); - } else { - /* advance ringbuffer position and transaction counter */ - mdev->al_tr_pos = (mdev->al_tr_pos + 1) % (MD_AL_SECTORS*512/MD_BLOCK_SIZE); - mdev->al_tr_number++; + else { + bool write_al_updates; + rcu_read_lock(); + write_al_updates = rcu_dereference(mdev->ldev->disk_conf)->al_updates; + rcu_read_unlock(); + if (write_al_updates) { + if (drbd_md_sync_page_io(mdev, mdev->ldev, sector, WRITE)) { + err = -EIO; + drbd_chk_io_error(mdev, 1, DRBD_META_IO_ERROR); + } else { + mdev->al_tr_number++; + mdev->al_writ_cnt++; + } + } } drbd_md_put_buffer(mdev); @@ -474,20 +606,18 @@ static int w_al_write_transaction(struct drbd_work *w, int unused) /* Calls from worker context (see w_restart_disk_io()) need to write the transaction directly. Others came through generic_make_request(), those need to delegate it to the worker. */ -static int al_write_transaction(struct drbd_conf *mdev) +static int al_write_transaction(struct drbd_conf *mdev, bool delegate) { - struct update_al_work al_work; - - if (current == mdev->tconn->worker.task) + if (delegate) { + struct update_al_work al_work; + init_completion(&al_work.event); + al_work.w.cb = w_al_write_transaction; + al_work.w.mdev = mdev; + drbd_queue_work_front(&mdev->tconn->sender_work, &al_work.w); + wait_for_completion(&al_work.event); + return al_work.err; + } else return _al_write_transaction(mdev); - - init_completion(&al_work.event); - al_work.w.cb = w_al_write_transaction; - al_work.w.mdev = mdev; - drbd_queue_work_front(&mdev->tconn->sender_work, &al_work.w); - wait_for_completion(&al_work.event); - - return al_work.err; } static int _try_lc_del(struct drbd_conf *mdev, struct lc_element *al_ext) diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c index 8dc29502dc08..64fbb8385cdc 100644 --- a/drivers/block/drbd/drbd_bitmap.c +++ b/drivers/block/drbd/drbd_bitmap.c @@ -612,6 +612,17 @@ static void bm_memset(struct drbd_bitmap *b, size_t offset, int c, size_t len) } } +/* For the layout, see comment above drbd_md_set_sector_offsets(). */ +static u64 drbd_md_on_disk_bits(struct drbd_backing_dev *ldev) +{ + u64 bitmap_sectors; + if (ldev->md.al_offset == 8) + bitmap_sectors = ldev->md.md_size_sect - ldev->md.bm_offset; + else + bitmap_sectors = ldev->md.al_offset - ldev->md.bm_offset; + return bitmap_sectors << (9 + 3); +} + /* * make sure the bitmap has enough room for the attached storage, * if necessary, resize. @@ -668,7 +679,7 @@ int drbd_bm_resize(struct drbd_conf *mdev, sector_t capacity, int set_new_bits) words = ALIGN(bits, 64) >> LN2_BPL; if (get_ldev(mdev)) { - u64 bits_on_disk = ((u64)mdev->ldev->md.md_size_sect-MD_BM_OFFSET) << 12; + u64 bits_on_disk = drbd_md_on_disk_bits(mdev->ldev); put_ldev(mdev); if (bits > bits_on_disk) { dev_info(DEV, "bits = %lu\n", bits); diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index 6b51afa1aae1..f943aacfdad8 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -753,13 +753,16 @@ struct drbd_md { u32 flags; u32 md_size_sect; - s32 al_offset; /* signed relative sector offset to al area */ + s32 al_offset; /* signed relative sector offset to activity log */ s32 bm_offset; /* signed relative sector offset to bitmap */ - /* u32 al_nr_extents; important for restoring the AL - * is stored into ldev->dc.al_extents, which in turn - * gets applied to act_log->nr_elements - */ + /* cached value of bdev->disk_conf->meta_dev_idx (see below) */ + s32 meta_dev_idx; + + /* see al_tr_number_to_on_disk_sector() */ + u32 al_stripes; + u32 al_stripe_size_4k; + u32 al_size_4k; /* cached product of the above */ }; struct drbd_backing_dev { @@ -891,6 +894,14 @@ struct drbd_tconn { /* is a resource from the config file */ } send; }; +struct submit_worker { + struct workqueue_struct *wq; + struct work_struct worker; + + spinlock_t lock; + struct list_head writes; +}; + struct drbd_conf { struct drbd_tconn *tconn; int vnr; /* volume number within the connection */ @@ -1009,7 +1020,6 @@ struct drbd_conf { struct lru_cache *act_log; /* activity log */ unsigned int al_tr_number; int al_tr_cycle; - int al_tr_pos; /* position of the next transaction in the journal */ wait_queue_head_t seq_wait; atomic_t packet_seq; unsigned int peer_seq; @@ -1032,6 +1042,10 @@ struct drbd_conf { atomic_t ap_in_flight; /* App sectors in flight (waiting for ack) */ unsigned int peer_max_bio_size; unsigned int local_max_bio_size; + + /* any requests that would block in drbd_make_request() + * are deferred to this single-threaded work queue */ + struct submit_worker submit; }; static inline struct drbd_conf *minor_to_mdev(unsigned int minor) @@ -1148,25 +1162,44 @@ extern int drbd_bitmap_io_from_worker(struct drbd_conf *mdev, char *why, enum bm_flag flags); extern int drbd_bmio_set_n_write(struct drbd_conf *mdev); extern int drbd_bmio_clear_n_write(struct drbd_conf *mdev); -extern void drbd_go_diskless(struct drbd_conf *mdev); extern void drbd_ldev_destroy(struct drbd_conf *mdev); /* Meta data layout - We reserve a 128MB Block (4k aligned) - * either at the end of the backing device - * or on a separate meta data device. */ + * + * We currently have two possible layouts. + * Offsets in (512 byte) sectors. + * external: + * |----------- md_size_sect ------------------| + * [ 4k superblock ][ activity log ][ Bitmap ] + * | al_offset == 8 | + * | bm_offset = al_offset + X | + * ==> bitmap sectors = md_size_sect - bm_offset + * + * Variants: + * old, indexed fixed size meta data: + * + * internal: + * |----------- md_size_sect ------------------| + * [data.....][ Bitmap ][ activity log ][ 4k superblock ][padding*] + * | al_offset < 0 | + * | bm_offset = al_offset - Y | + * ==> bitmap sectors = Y = al_offset - bm_offset + * + * [padding*] are zero or up to 7 unused 512 Byte sectors to the + * end of the device, so that the [4k superblock] will be 4k aligned. + * + * The activity log consists of 4k transaction blocks, + * which are written in a ring-buffer, or striped ring-buffer like fashion, + * which are writtensize used to be fixed 32kB, + * but is about to become configurable. + */ -/* The following numbers are sectors */ -/* Allows up to about 3.8TB, so if you want more, +/* Our old fixed size meta data layout + * allows up to about 3.8TB, so if you want more, * you need to use the "flexible" meta data format. */ -#define MD_RESERVED_SECT (128LU << 11) /* 128 MB, unit sectors */ -#define MD_AL_OFFSET 8 /* 8 Sectors after start of meta area */ -#define MD_AL_SECTORS 64 /* = 32 kB on disk activity log ring buffer */ -#define MD_BM_OFFSET (MD_AL_OFFSET + MD_AL_SECTORS) - -/* we do all meta data IO in 4k blocks */ -#define MD_BLOCK_SHIFT 12 -#define MD_BLOCK_SIZE (1<md.meta_dev_idx) { case DRBD_MD_INDEX_INTERNAL: case DRBD_MD_INDEX_FLEX_INT: return bdev->md.md_offset + bdev->md.bm_offset; @@ -1767,36 +1805,19 @@ static inline sector_t _drbd_md_first_sector(int meta_dev_idx, struct drbd_backi } } -static inline sector_t drbd_md_first_sector(struct drbd_backing_dev *bdev) -{ - int meta_dev_idx; - - rcu_read_lock(); - meta_dev_idx = rcu_dereference(bdev->disk_conf)->meta_dev_idx; - rcu_read_unlock(); - - return _drbd_md_first_sector(meta_dev_idx, bdev); -} - /** * drbd_md_last_sector() - Return the last sector number of the meta data area * @bdev: Meta data block device. */ static inline sector_t drbd_md_last_sector(struct drbd_backing_dev *bdev) { - int meta_dev_idx; - - rcu_read_lock(); - meta_dev_idx = rcu_dereference(bdev->disk_conf)->meta_dev_idx; - rcu_read_unlock(); - - switch (meta_dev_idx) { + switch (bdev->md.meta_dev_idx) { case DRBD_MD_INDEX_INTERNAL: case DRBD_MD_INDEX_FLEX_INT: - return bdev->md.md_offset + MD_AL_OFFSET - 1; + return bdev->md.md_offset + MD_4kB_SECT -1; case DRBD_MD_INDEX_FLEX_EXT: default: - return bdev->md.md_offset + bdev->md.md_size_sect; + return bdev->md.md_offset + bdev->md.md_size_sect -1; } } @@ -1818,18 +1839,13 @@ static inline sector_t drbd_get_capacity(struct block_device *bdev) static inline sector_t drbd_get_max_capacity(struct drbd_backing_dev *bdev) { sector_t s; - int meta_dev_idx; - rcu_read_lock(); - meta_dev_idx = rcu_dereference(bdev->disk_conf)->meta_dev_idx; - rcu_read_unlock(); - - switch (meta_dev_idx) { + switch (bdev->md.meta_dev_idx) { case DRBD_MD_INDEX_INTERNAL: case DRBD_MD_INDEX_FLEX_INT: s = drbd_get_capacity(bdev->backing_bdev) ? min_t(sector_t, DRBD_MAX_SECTORS_FLEX, - _drbd_md_first_sector(meta_dev_idx, bdev)) + drbd_md_first_sector(bdev)) : 0; break; case DRBD_MD_INDEX_FLEX_EXT: @@ -1848,39 +1864,24 @@ static inline sector_t drbd_get_max_capacity(struct drbd_backing_dev *bdev) } /** - * drbd_md_ss__() - Return the sector number of our meta data super block - * @mdev: DRBD device. + * drbd_md_ss() - Return the sector number of our meta data super block * @bdev: Meta data block device. */ -static inline sector_t drbd_md_ss__(struct drbd_conf *mdev, - struct drbd_backing_dev *bdev) +static inline sector_t drbd_md_ss(struct drbd_backing_dev *bdev) { - int meta_dev_idx; + const int meta_dev_idx = bdev->md.meta_dev_idx; - rcu_read_lock(); - meta_dev_idx = rcu_dereference(bdev->disk_conf)->meta_dev_idx; - rcu_read_unlock(); - - switch (meta_dev_idx) { - default: /* external, some index */ - return MD_RESERVED_SECT * meta_dev_idx; - case DRBD_MD_INDEX_INTERNAL: - /* with drbd08, internal meta data is always "flexible" */ - case DRBD_MD_INDEX_FLEX_INT: - /* sizeof(struct md_on_disk_07) == 4k - * position: last 4k aligned block of 4k size */ - if (!bdev->backing_bdev) { - if (__ratelimit(&drbd_ratelimit_state)) { - dev_err(DEV, "bdev->backing_bdev==NULL\n"); - dump_stack(); - } - return 0; - } - return (drbd_get_capacity(bdev->backing_bdev) & ~7ULL) - - MD_AL_OFFSET; - case DRBD_MD_INDEX_FLEX_EXT: + if (meta_dev_idx == DRBD_MD_INDEX_FLEX_EXT) return 0; - } + + /* Since drbd08, internal meta data is always "flexible". + * position: last 4k aligned block of 4k size */ + if (meta_dev_idx == DRBD_MD_INDEX_INTERNAL || + meta_dev_idx == DRBD_MD_INDEX_FLEX_INT) + return (drbd_get_capacity(bdev->backing_bdev) & ~7ULL) - 8; + + /* external, some index; this is the old fixed size layout */ + return MD_128MB_SECT * bdev->md.meta_dev_idx; } static inline void @@ -2053,9 +2054,11 @@ static inline void put_ldev(struct drbd_conf *mdev) if (mdev->state.disk == D_DISKLESS) /* even internal references gone, safe to destroy */ drbd_ldev_destroy(mdev); - if (mdev->state.disk == D_FAILED) + if (mdev->state.disk == D_FAILED) { /* all application IO references gone. */ - drbd_go_diskless(mdev); + if (!test_and_set_bit(GO_DISKLESS, &mdev->flags)) + drbd_queue_work(&mdev->tconn->sender_work, &mdev->go_diskless); + } wake_up(&mdev->misc_wait); } } diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 298b868910dc..a5dca6affcbb 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -45,7 +45,7 @@ #include #include #include - +#include #define __KERNEL_SYSCALLS__ #include #include @@ -2299,6 +2299,7 @@ static void drbd_cleanup(void) idr_for_each_entry(&minors, mdev, i) { idr_remove(&minors, mdev_to_minor(mdev)); idr_remove(&mdev->tconn->volumes, mdev->vnr); + destroy_workqueue(mdev->submit.wq); del_gendisk(mdev->vdisk); /* synchronize_rcu(); No other threads running at this point */ kref_put(&mdev->kref, &drbd_minor_destroy); @@ -2588,6 +2589,21 @@ void conn_destroy(struct kref *kref) kfree(tconn); } +int init_submitter(struct drbd_conf *mdev) +{ + /* opencoded create_singlethread_workqueue(), + * to be able to say "drbd%d", ..., minor */ + mdev->submit.wq = alloc_workqueue("drbd%u_submit", + WQ_UNBOUND | WQ_MEM_RECLAIM, 1, mdev->minor); + if (!mdev->submit.wq) + return -ENOMEM; + + INIT_WORK(&mdev->submit.worker, do_submit); + spin_lock_init(&mdev->submit.lock); + INIT_LIST_HEAD(&mdev->submit.writes); + return 0; +} + enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor, int vnr) { struct drbd_conf *mdev; @@ -2677,6 +2693,12 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor, goto out_idr_remove_minor; } + if (init_submitter(mdev)) { + err = ERR_NOMEM; + drbd_msg_put_info("unable to create submit workqueue"); + goto out_idr_remove_vol; + } + add_disk(disk); kref_init(&mdev->kref); /* one ref for both idrs and the the add_disk */ @@ -2687,6 +2709,8 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor, return NO_ERROR; +out_idr_remove_vol: + idr_remove(&tconn->volumes, vnr_got); out_idr_remove_minor: idr_remove(&minors, minor_got); synchronize_rcu(); @@ -2794,6 +2818,7 @@ void drbd_free_bc(struct drbd_backing_dev *ldev) blkdev_put(ldev->backing_bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); blkdev_put(ldev->md_bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); + kfree(ldev->disk_conf); kfree(ldev); } @@ -2833,8 +2858,9 @@ void conn_md_sync(struct drbd_tconn *tconn) rcu_read_unlock(); } +/* aligned 4kByte */ struct meta_data_on_disk { - u64 la_size; /* last agreed size. */ + u64 la_size_sect; /* last agreed size. */ u64 uuid[UI_SIZE]; /* UUIDs. */ u64 device_uuid; u64 reserved_u64_1; @@ -2842,13 +2868,17 @@ struct meta_data_on_disk { u32 magic; u32 md_size_sect; u32 al_offset; /* offset to this block */ - u32 al_nr_extents; /* important for restoring the AL */ + u32 al_nr_extents; /* important for restoring the AL (userspace) */ /* `-- act_log->nr_elements <-- ldev->dc.al_extents */ u32 bm_offset; /* offset to the bitmap, from here */ u32 bm_bytes_per_bit; /* BM_BLOCK_SIZE */ u32 la_peer_max_bio_size; /* last peer max_bio_size */ - u32 reserved_u32[3]; + /* see al_tr_number_to_on_disk_sector() */ + u32 al_stripes; + u32 al_stripe_size_4k; + + u8 reserved_u8[4096 - (7*8 + 10*4)]; } __packed; /** @@ -2861,6 +2891,10 @@ void drbd_md_sync(struct drbd_conf *mdev) sector_t sector; int i; + /* Don't accidentally change the DRBD meta data layout. */ + BUILD_BUG_ON(UI_SIZE != 4); + BUILD_BUG_ON(sizeof(struct meta_data_on_disk) != 4096); + del_timer(&mdev->md_sync_timer); /* timer may be rearmed by drbd_md_mark_dirty() now. */ if (!test_and_clear_bit(MD_DIRTY, &mdev->flags)) @@ -2875,9 +2909,9 @@ void drbd_md_sync(struct drbd_conf *mdev) if (!buffer) goto out; - memset(buffer, 0, 512); + memset(buffer, 0, sizeof(*buffer)); - buffer->la_size = cpu_to_be64(drbd_get_capacity(mdev->this_bdev)); + buffer->la_size_sect = cpu_to_be64(drbd_get_capacity(mdev->this_bdev)); for (i = UI_CURRENT; i < UI_SIZE; i++) buffer->uuid[i] = cpu_to_be64(mdev->ldev->md.uuid[i]); buffer->flags = cpu_to_be32(mdev->ldev->md.flags); @@ -2892,7 +2926,10 @@ void drbd_md_sync(struct drbd_conf *mdev) buffer->bm_offset = cpu_to_be32(mdev->ldev->md.bm_offset); buffer->la_peer_max_bio_size = cpu_to_be32(mdev->peer_max_bio_size); - D_ASSERT(drbd_md_ss__(mdev, mdev->ldev) == mdev->ldev->md.md_offset); + buffer->al_stripes = cpu_to_be32(mdev->ldev->md.al_stripes); + buffer->al_stripe_size_4k = cpu_to_be32(mdev->ldev->md.al_stripe_size_4k); + + D_ASSERT(drbd_md_ss(mdev->ldev) == mdev->ldev->md.md_offset); sector = mdev->ldev->md.md_offset; if (drbd_md_sync_page_io(mdev, mdev->ldev, sector, WRITE)) { @@ -2910,13 +2947,141 @@ out: put_ldev(mdev); } +static int check_activity_log_stripe_size(struct drbd_conf *mdev, + struct meta_data_on_disk *on_disk, + struct drbd_md *in_core) +{ + u32 al_stripes = be32_to_cpu(on_disk->al_stripes); + u32 al_stripe_size_4k = be32_to_cpu(on_disk->al_stripe_size_4k); + u64 al_size_4k; + + /* both not set: default to old fixed size activity log */ + if (al_stripes == 0 && al_stripe_size_4k == 0) { + al_stripes = 1; + al_stripe_size_4k = MD_32kB_SECT/8; + } + + /* some paranoia plausibility checks */ + + /* we need both values to be set */ + if (al_stripes == 0 || al_stripe_size_4k == 0) + goto err; + + al_size_4k = (u64)al_stripes * al_stripe_size_4k; + + /* Upper limit of activity log area, to avoid potential overflow + * problems in al_tr_number_to_on_disk_sector(). As right now, more + * than 72 * 4k blocks total only increases the amount of history, + * limiting this arbitrarily to 16 GB is not a real limitation ;-) */ + if (al_size_4k > (16 * 1024 * 1024/4)) + goto err; + + /* Lower limit: we need at least 8 transaction slots (32kB) + * to not break existing setups */ + if (al_size_4k < MD_32kB_SECT/8) + goto err; + + in_core->al_stripe_size_4k = al_stripe_size_4k; + in_core->al_stripes = al_stripes; + in_core->al_size_4k = al_size_4k; + + return 0; +err: + dev_err(DEV, "invalid activity log striping: al_stripes=%u, al_stripe_size_4k=%u\n", + al_stripes, al_stripe_size_4k); + return -EINVAL; +} + +static int check_offsets_and_sizes(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) +{ + sector_t capacity = drbd_get_capacity(bdev->md_bdev); + struct drbd_md *in_core = &bdev->md; + s32 on_disk_al_sect; + s32 on_disk_bm_sect; + + /* The on-disk size of the activity log, calculated from offsets, and + * the size of the activity log calculated from the stripe settings, + * should match. + * Though we could relax this a bit: it is ok, if the striped activity log + * fits in the available on-disk activity log size. + * Right now, that would break how resize is implemented. + * TODO: make drbd_determine_dev_size() (and the drbdmeta tool) aware + * of possible unused padding space in the on disk layout. */ + if (in_core->al_offset < 0) { + if (in_core->bm_offset > in_core->al_offset) + goto err; + on_disk_al_sect = -in_core->al_offset; + on_disk_bm_sect = in_core->al_offset - in_core->bm_offset; + } else { + if (in_core->al_offset != MD_4kB_SECT) + goto err; + if (in_core->bm_offset < in_core->al_offset + in_core->al_size_4k * MD_4kB_SECT) + goto err; + + on_disk_al_sect = in_core->bm_offset - MD_4kB_SECT; + on_disk_bm_sect = in_core->md_size_sect - in_core->bm_offset; + } + + /* old fixed size meta data is exactly that: fixed. */ + if (in_core->meta_dev_idx >= 0) { + if (in_core->md_size_sect != MD_128MB_SECT + || in_core->al_offset != MD_4kB_SECT + || in_core->bm_offset != MD_4kB_SECT + MD_32kB_SECT + || in_core->al_stripes != 1 + || in_core->al_stripe_size_4k != MD_32kB_SECT/8) + goto err; + } + + if (capacity < in_core->md_size_sect) + goto err; + if (capacity - in_core->md_size_sect < drbd_md_first_sector(bdev)) + goto err; + + /* should be aligned, and at least 32k */ + if ((on_disk_al_sect & 7) || (on_disk_al_sect < MD_32kB_SECT)) + goto err; + + /* should fit (for now: exactly) into the available on-disk space; + * overflow prevention is in check_activity_log_stripe_size() above. */ + if (on_disk_al_sect != in_core->al_size_4k * MD_4kB_SECT) + goto err; + + /* again, should be aligned */ + if (in_core->bm_offset & 7) + goto err; + + /* FIXME check for device grow with flex external meta data? */ + + /* can the available bitmap space cover the last agreed device size? */ + if (on_disk_bm_sect < (in_core->la_size_sect+7)/MD_4kB_SECT/8/512) + goto err; + + return 0; + +err: + dev_err(DEV, "meta data offsets don't make sense: idx=%d " + "al_s=%u, al_sz4k=%u, al_offset=%d, bm_offset=%d, " + "md_size_sect=%u, la_size=%llu, md_capacity=%llu\n", + in_core->meta_dev_idx, + in_core->al_stripes, in_core->al_stripe_size_4k, + in_core->al_offset, in_core->bm_offset, in_core->md_size_sect, + (unsigned long long)in_core->la_size_sect, + (unsigned long long)capacity); + + return -EINVAL; +} + + /** * drbd_md_read() - Reads in the meta data super block * @mdev: DRBD device. * @bdev: Device from which the meta data should be read in. * - * Return 0 (NO_ERROR) on success, and an enum drbd_ret_code in case + * Return NO_ERROR on success, and an enum drbd_ret_code in case * something goes wrong. + * + * Called exactly once during drbd_adm_attach(), while still being D_DISKLESS, + * even before @bdev is assigned to @mdev->ldev. */ int drbd_md_read(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) { @@ -2924,12 +3089,17 @@ int drbd_md_read(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) u32 magic, flags; int i, rv = NO_ERROR; - if (!get_ldev_if_state(mdev, D_ATTACHING)) - return ERR_IO_MD_DISK; + if (mdev->state.disk != D_DISKLESS) + return ERR_DISK_CONFIGURED; buffer = drbd_md_get_buffer(mdev); if (!buffer) - goto out; + return ERR_NOMEM; + + /* First, figure out where our meta data superblock is located, + * and read it. */ + bdev->md.meta_dev_idx = bdev->disk_conf->meta_dev_idx; + bdev->md.md_offset = drbd_md_ss(bdev); if (drbd_md_sync_page_io(mdev, bdev, bdev->md.md_offset, READ)) { /* NOTE: can't do normal error processing here as this is @@ -2948,45 +3118,51 @@ int drbd_md_read(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) rv = ERR_MD_UNCLEAN; goto err; } + + rv = ERR_MD_INVALID; if (magic != DRBD_MD_MAGIC_08) { if (magic == DRBD_MD_MAGIC_07) dev_err(DEV, "Found old (0.7) meta data magic. Did you \"drbdadm create-md\"?\n"); else dev_err(DEV, "Meta data magic not found. Did you \"drbdadm create-md\"?\n"); - rv = ERR_MD_INVALID; goto err; } - if (be32_to_cpu(buffer->al_offset) != bdev->md.al_offset) { - dev_err(DEV, "unexpected al_offset: %d (expected %d)\n", - be32_to_cpu(buffer->al_offset), bdev->md.al_offset); - rv = ERR_MD_INVALID; + + if (be32_to_cpu(buffer->bm_bytes_per_bit) != BM_BLOCK_SIZE) { + dev_err(DEV, "unexpected bm_bytes_per_bit: %u (expected %u)\n", + be32_to_cpu(buffer->bm_bytes_per_bit), BM_BLOCK_SIZE); goto err; } + + + /* convert to in_core endian */ + bdev->md.la_size_sect = be64_to_cpu(buffer->la_size_sect); + for (i = UI_CURRENT; i < UI_SIZE; i++) + bdev->md.uuid[i] = be64_to_cpu(buffer->uuid[i]); + bdev->md.flags = be32_to_cpu(buffer->flags); + bdev->md.device_uuid = be64_to_cpu(buffer->device_uuid); + + bdev->md.md_size_sect = be32_to_cpu(buffer->md_size_sect); + bdev->md.al_offset = be32_to_cpu(buffer->al_offset); + bdev->md.bm_offset = be32_to_cpu(buffer->bm_offset); + + if (check_activity_log_stripe_size(mdev, buffer, &bdev->md)) + goto err; + if (check_offsets_and_sizes(mdev, bdev)) + goto err; + if (be32_to_cpu(buffer->bm_offset) != bdev->md.bm_offset) { dev_err(DEV, "unexpected bm_offset: %d (expected %d)\n", be32_to_cpu(buffer->bm_offset), bdev->md.bm_offset); - rv = ERR_MD_INVALID; goto err; } if (be32_to_cpu(buffer->md_size_sect) != bdev->md.md_size_sect) { dev_err(DEV, "unexpected md_size: %u (expected %u)\n", be32_to_cpu(buffer->md_size_sect), bdev->md.md_size_sect); - rv = ERR_MD_INVALID; goto err; } - if (be32_to_cpu(buffer->bm_bytes_per_bit) != BM_BLOCK_SIZE) { - dev_err(DEV, "unexpected bm_bytes_per_bit: %u (expected %u)\n", - be32_to_cpu(buffer->bm_bytes_per_bit), BM_BLOCK_SIZE); - rv = ERR_MD_INVALID; - goto err; - } - - bdev->md.la_size_sect = be64_to_cpu(buffer->la_size); - for (i = UI_CURRENT; i < UI_SIZE; i++) - bdev->md.uuid[i] = be64_to_cpu(buffer->uuid[i]); - bdev->md.flags = be32_to_cpu(buffer->flags); - bdev->md.device_uuid = be64_to_cpu(buffer->device_uuid); + rv = NO_ERROR; spin_lock_irq(&mdev->tconn->req_lock); if (mdev->state.conn < C_CONNECTED) { @@ -2999,8 +3175,6 @@ int drbd_md_read(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) err: drbd_md_put_buffer(mdev); - out: - put_ldev(mdev); return rv; } @@ -3238,8 +3412,12 @@ static int w_go_diskless(struct drbd_work *w, int unused) * end up here after a failed attach, before ldev was even assigned. */ if (mdev->bitmap && mdev->ldev) { + /* An interrupted resync or similar is allowed to recounts bits + * while we detach. + * Any modifications would not be expected anymore, though. + */ if (drbd_bitmap_io_from_worker(mdev, drbd_bm_write, - "detach", BM_LOCKED_MASK)) { + "detach", BM_LOCKED_TEST_ALLOWED)) { if (test_bit(WAS_READ_ERROR, &mdev->flags)) { drbd_md_set_flag(mdev, MDF_FULL_SYNC); drbd_md_sync(mdev); @@ -3251,13 +3429,6 @@ static int w_go_diskless(struct drbd_work *w, int unused) return 0; } -void drbd_go_diskless(struct drbd_conf *mdev) -{ - D_ASSERT(mdev->state.disk == D_FAILED); - if (!test_and_set_bit(GO_DISKLESS, &mdev->flags)) - drbd_queue_work(&mdev->tconn->sender_work, &mdev->go_diskless); -} - /** * drbd_queue_bitmap_io() - Queues an IO operation on the whole bitmap * @mdev: DRBD device. diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index 2af26fc95280..9e3f441e7e84 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -696,37 +696,52 @@ out: return 0; } -/* initializes the md.*_offset members, so we are able to find - * the on disk meta data */ +/* Initializes the md.*_offset members, so we are able to find + * the on disk meta data. + * + * We currently have two possible layouts: + * external: + * |----------- md_size_sect ------------------| + * [ 4k superblock ][ activity log ][ Bitmap ] + * | al_offset == 8 | + * | bm_offset = al_offset + X | + * ==> bitmap sectors = md_size_sect - bm_offset + * + * internal: + * |----------- md_size_sect ------------------| + * [data.....][ Bitmap ][ activity log ][ 4k superblock ] + * | al_offset < 0 | + * | bm_offset = al_offset - Y | + * ==> bitmap sectors = Y = al_offset - bm_offset + * + * Activity log size used to be fixed 32kB, + * but is about to become configurable. + */ static void drbd_md_set_sector_offsets(struct drbd_conf *mdev, struct drbd_backing_dev *bdev) { sector_t md_size_sect = 0; - int meta_dev_idx; + unsigned int al_size_sect = bdev->md.al_size_4k * 8; - rcu_read_lock(); - meta_dev_idx = rcu_dereference(bdev->disk_conf)->meta_dev_idx; + bdev->md.md_offset = drbd_md_ss(bdev); - switch (meta_dev_idx) { + switch (bdev->md.meta_dev_idx) { default: /* v07 style fixed size indexed meta data */ - bdev->md.md_size_sect = MD_RESERVED_SECT; - bdev->md.md_offset = drbd_md_ss__(mdev, bdev); - bdev->md.al_offset = MD_AL_OFFSET; - bdev->md.bm_offset = MD_BM_OFFSET; + bdev->md.md_size_sect = MD_128MB_SECT; + bdev->md.al_offset = MD_4kB_SECT; + bdev->md.bm_offset = MD_4kB_SECT + al_size_sect; break; case DRBD_MD_INDEX_FLEX_EXT: /* just occupy the full device; unit: sectors */ bdev->md.md_size_sect = drbd_get_capacity(bdev->md_bdev); - bdev->md.md_offset = 0; - bdev->md.al_offset = MD_AL_OFFSET; - bdev->md.bm_offset = MD_BM_OFFSET; + bdev->md.al_offset = MD_4kB_SECT; + bdev->md.bm_offset = MD_4kB_SECT + al_size_sect; break; case DRBD_MD_INDEX_INTERNAL: case DRBD_MD_INDEX_FLEX_INT: - bdev->md.md_offset = drbd_md_ss__(mdev, bdev); /* al size is still fixed */ - bdev->md.al_offset = -MD_AL_SECTORS; + bdev->md.al_offset = -al_size_sect; /* we need (slightly less than) ~ this much bitmap sectors: */ md_size_sect = drbd_get_capacity(bdev->backing_bdev); md_size_sect = ALIGN(md_size_sect, BM_SECT_PER_EXT); @@ -735,14 +750,13 @@ static void drbd_md_set_sector_offsets(struct drbd_conf *mdev, /* plus the "drbd meta data super block", * and the activity log; */ - md_size_sect += MD_BM_OFFSET; + md_size_sect += MD_4kB_SECT + al_size_sect; bdev->md.md_size_sect = md_size_sect; /* bitmap offset is adjusted by 'super' block size */ - bdev->md.bm_offset = -md_size_sect + MD_AL_OFFSET; + bdev->md.bm_offset = -md_size_sect + MD_4kB_SECT; break; } - rcu_read_unlock(); } /* input size is expected to be in KB */ @@ -805,7 +819,7 @@ void drbd_resume_io(struct drbd_conf *mdev) enum determine_dev_size drbd_determine_dev_size(struct drbd_conf *mdev, enum dds_flags flags) __must_hold(local) { sector_t prev_first_sect, prev_size; /* previous meta location */ - sector_t la_size, u_size; + sector_t la_size_sect, u_size; sector_t size; char ppb[10]; @@ -828,7 +842,7 @@ enum determine_dev_size drbd_determine_dev_size(struct drbd_conf *mdev, enum dds prev_first_sect = drbd_md_first_sector(mdev->ldev); prev_size = mdev->ldev->md.md_size_sect; - la_size = mdev->ldev->md.la_size_sect; + la_size_sect = mdev->ldev->md.la_size_sect; /* TODO: should only be some assert here, not (re)init... */ drbd_md_set_sector_offsets(mdev, mdev->ldev); @@ -864,7 +878,7 @@ enum determine_dev_size drbd_determine_dev_size(struct drbd_conf *mdev, enum dds if (rv == dev_size_error) goto out; - la_size_changed = (la_size != mdev->ldev->md.la_size_sect); + la_size_changed = (la_size_sect != mdev->ldev->md.la_size_sect); md_moved = prev_first_sect != drbd_md_first_sector(mdev->ldev) || prev_size != mdev->ldev->md.md_size_sect; @@ -886,9 +900,9 @@ enum determine_dev_size drbd_determine_dev_size(struct drbd_conf *mdev, enum dds drbd_md_mark_dirty(mdev); } - if (size > la_size) + if (size > la_size_sect) rv = grew; - if (size < la_size) + if (size < la_size_sect) rv = shrunk; out: lc_unlock(mdev->act_log); @@ -903,7 +917,7 @@ drbd_new_dev_size(struct drbd_conf *mdev, struct drbd_backing_dev *bdev, sector_t u_size, int assume_peer_has_space) { sector_t p_size = mdev->p_size; /* partner's disk size. */ - sector_t la_size = bdev->md.la_size_sect; /* last agreed size. */ + sector_t la_size_sect = bdev->md.la_size_sect; /* last agreed size. */ sector_t m_size; /* my size */ sector_t size = 0; @@ -917,8 +931,8 @@ drbd_new_dev_size(struct drbd_conf *mdev, struct drbd_backing_dev *bdev, if (p_size && m_size) { size = min_t(sector_t, p_size, m_size); } else { - if (la_size) { - size = la_size; + if (la_size_sect) { + size = la_size_sect; if (m_size && m_size < size) size = m_size; if (p_size && p_size < size) @@ -1127,15 +1141,32 @@ static bool should_set_defaults(struct genl_info *info) return 0 != (flags & DRBD_GENL_F_SET_DEFAULTS); } -static void enforce_disk_conf_limits(struct disk_conf *dc) +static unsigned int drbd_al_extents_max(struct drbd_backing_dev *bdev) { - if (dc->al_extents < DRBD_AL_EXTENTS_MIN) - dc->al_extents = DRBD_AL_EXTENTS_MIN; - if (dc->al_extents > DRBD_AL_EXTENTS_MAX) - dc->al_extents = DRBD_AL_EXTENTS_MAX; + /* This is limited by 16 bit "slot" numbers, + * and by available on-disk context storage. + * + * Also (u16)~0 is special (denotes a "free" extent). + * + * One transaction occupies one 4kB on-disk block, + * we have n such blocks in the on disk ring buffer, + * the "current" transaction may fail (n-1), + * and there is 919 slot numbers context information per transaction. + * + * 72 transaction blocks amounts to more than 2**16 context slots, + * so cap there first. + */ + const unsigned int max_al_nr = DRBD_AL_EXTENTS_MAX; + const unsigned int sufficient_on_disk = + (max_al_nr + AL_CONTEXT_PER_TRANSACTION -1) + /AL_CONTEXT_PER_TRANSACTION; + + unsigned int al_size_4k = bdev->md.al_size_4k; + + if (al_size_4k > sufficient_on_disk) + return max_al_nr; - if (dc->c_plan_ahead > DRBD_C_PLAN_AHEAD_MAX) - dc->c_plan_ahead = DRBD_C_PLAN_AHEAD_MAX; + return (al_size_4k - 1) * AL_CONTEXT_PER_TRANSACTION; } int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) @@ -1182,7 +1213,13 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) if (!expect(new_disk_conf->resync_rate >= 1)) new_disk_conf->resync_rate = 1; - enforce_disk_conf_limits(new_disk_conf); + if (new_disk_conf->al_extents < DRBD_AL_EXTENTS_MIN) + new_disk_conf->al_extents = DRBD_AL_EXTENTS_MIN; + if (new_disk_conf->al_extents > drbd_al_extents_max(mdev->ldev)) + new_disk_conf->al_extents = drbd_al_extents_max(mdev->ldev); + + if (new_disk_conf->c_plan_ahead > DRBD_C_PLAN_AHEAD_MAX) + new_disk_conf->c_plan_ahead = DRBD_C_PLAN_AHEAD_MAX; fifo_size = (new_disk_conf->c_plan_ahead * 10 * SLEEP_TIME) / HZ; if (fifo_size != mdev->rs_plan_s->size) { @@ -1330,7 +1367,8 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) goto fail; } - enforce_disk_conf_limits(new_disk_conf); + if (new_disk_conf->c_plan_ahead > DRBD_C_PLAN_AHEAD_MAX) + new_disk_conf->c_plan_ahead = DRBD_C_PLAN_AHEAD_MAX; new_plan = fifo_alloc((new_disk_conf->c_plan_ahead * 10 * SLEEP_TIME) / HZ); if (!new_plan) { @@ -1343,6 +1381,12 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) goto fail; } + write_lock_irq(&global_state_lock); + retcode = drbd_resync_after_valid(mdev, new_disk_conf->resync_after); + write_unlock_irq(&global_state_lock); + if (retcode != NO_ERROR) + goto fail; + rcu_read_lock(); nc = rcu_dereference(mdev->tconn->net_conf); if (nc) { @@ -1399,8 +1443,16 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) goto fail; } - /* RT - for drbd_get_max_capacity() DRBD_MD_INDEX_FLEX_INT */ - drbd_md_set_sector_offsets(mdev, nbc); + /* Read our meta data super block early. + * This also sets other on-disk offsets. */ + retcode = drbd_md_read(mdev, nbc); + if (retcode != NO_ERROR) + goto fail; + + if (new_disk_conf->al_extents < DRBD_AL_EXTENTS_MIN) + new_disk_conf->al_extents = DRBD_AL_EXTENTS_MIN; + if (new_disk_conf->al_extents > drbd_al_extents_max(nbc)) + new_disk_conf->al_extents = drbd_al_extents_max(nbc); if (drbd_get_max_capacity(nbc) < new_disk_conf->disk_size) { dev_err(DEV, "max capacity %llu smaller than disk size %llu\n", @@ -1416,7 +1468,7 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) min_md_device_sectors = (2<<10); } else { max_possible_sectors = DRBD_MAX_SECTORS; - min_md_device_sectors = MD_RESERVED_SECT * (new_disk_conf->meta_dev_idx + 1); + min_md_device_sectors = MD_128MB_SECT * (new_disk_conf->meta_dev_idx + 1); } if (drbd_get_capacity(nbc->md_bdev) < min_md_device_sectors) { @@ -1467,8 +1519,6 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) if (!get_ldev_if_state(mdev, D_ATTACHING)) goto force_diskless; - drbd_md_set_sector_offsets(mdev, nbc); - if (!mdev->bitmap) { if (drbd_bm_init(mdev)) { retcode = ERR_NOMEM; @@ -1476,10 +1526,6 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) } } - retcode = drbd_md_read(mdev, nbc); - if (retcode != NO_ERROR) - goto force_diskless_dec; - if (mdev->state.conn < C_CONNECTED && mdev->state.role == R_PRIMARY && (mdev->ed_uuid & ~((u64)1)) != (nbc->md.uuid[UI_CURRENT] & ~((u64)1))) { @@ -2158,8 +2204,11 @@ static enum drbd_state_rv conn_try_disconnect(struct drbd_tconn *tconn, bool for return SS_SUCCESS; case SS_PRIMARY_NOP: /* Our state checking code wants to see the peer outdated. */ - rv = conn_request_state(tconn, NS2(conn, C_DISCONNECTING, - pdsk, D_OUTDATED), CS_VERBOSE); + rv = conn_request_state(tconn, NS2(conn, C_DISCONNECTING, pdsk, D_OUTDATED), 0); + + if (rv == SS_OUTDATE_WO_CONN) /* lost connection before graceful disconnect succeeded */ + rv = conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_VERBOSE); + break; case SS_CW_FAILED_BY_PEER: /* The peer probably wants to see us outdated. */ @@ -2406,22 +2455,19 @@ int drbd_adm_invalidate(struct sk_buff *skb, struct genl_info *info) wait_event(mdev->misc_wait, !test_bit(BITMAP_IO, &mdev->flags)); drbd_flush_workqueue(mdev); - retcode = _drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_T), CS_ORDERED); - - if (retcode < SS_SUCCESS && retcode != SS_NEED_CONNECTION) - retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_T)); - - while (retcode == SS_NEED_CONNECTION) { - spin_lock_irq(&mdev->tconn->req_lock); - if (mdev->state.conn < C_CONNECTED) - retcode = _drbd_set_state(_NS(mdev, disk, D_INCONSISTENT), CS_VERBOSE, NULL); - spin_unlock_irq(&mdev->tconn->req_lock); - - if (retcode != SS_NEED_CONNECTION) - break; - + /* If we happen to be C_STANDALONE R_SECONDARY, just change to + * D_INCONSISTENT, and set all bits in the bitmap. Otherwise, + * try to start a resync handshake as sync target for full sync. + */ + if (mdev->state.conn == C_STANDALONE && mdev->state.role == R_SECONDARY) { + retcode = drbd_request_state(mdev, NS(disk, D_INCONSISTENT)); + if (retcode >= SS_SUCCESS) { + if (drbd_bitmap_io(mdev, &drbd_bmio_set_n_write, + "set_n_write from invalidate", BM_LOCKED_MASK)) + retcode = ERR_IO_MD_DISK; + } + } else retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_T)); - } drbd_resume_io(mdev); out: @@ -2475,21 +2521,22 @@ int drbd_adm_invalidate_peer(struct sk_buff *skb, struct genl_info *info) wait_event(mdev->misc_wait, !test_bit(BITMAP_IO, &mdev->flags)); drbd_flush_workqueue(mdev); - retcode = _drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_S), CS_ORDERED); - if (retcode < SS_SUCCESS) { - if (retcode == SS_NEED_CONNECTION && mdev->state.role == R_PRIMARY) { - /* The peer will get a resync upon connect anyways. - * Just make that into a full resync. */ - retcode = drbd_request_state(mdev, NS(pdsk, D_INCONSISTENT)); - if (retcode >= SS_SUCCESS) { - if (drbd_bitmap_io(mdev, &drbd_bmio_set_susp_al, - "set_n_write from invalidate_peer", - BM_LOCKED_SET_ALLOWED)) - retcode = ERR_IO_MD_DISK; - } - } else - retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_S)); - } + /* If we happen to be C_STANDALONE R_PRIMARY, just set all bits + * in the bitmap. Otherwise, try to start a resync handshake + * as sync source for full sync. + */ + if (mdev->state.conn == C_STANDALONE && mdev->state.role == R_PRIMARY) { + /* The peer will get a resync upon connect anyways. Just make that + into a full resync. */ + retcode = drbd_request_state(mdev, NS(pdsk, D_INCONSISTENT)); + if (retcode >= SS_SUCCESS) { + if (drbd_bitmap_io(mdev, &drbd_bmio_set_susp_al, + "set_n_write from invalidate_peer", + BM_LOCKED_SET_ALLOWED)) + retcode = ERR_IO_MD_DISK; + } + } else + retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_S)); drbd_resume_io(mdev); out: @@ -3162,6 +3209,7 @@ static enum drbd_ret_code adm_delete_minor(struct drbd_conf *mdev) CS_VERBOSE + CS_WAIT_COMPLETE); idr_remove(&mdev->tconn->volumes, mdev->vnr); idr_remove(&minors, mdev_to_minor(mdev)); + destroy_workqueue(mdev->submit.wq); del_gendisk(mdev->vdisk); synchronize_rcu(); kref_put(&mdev->kref, &drbd_minor_destroy); diff --git a/drivers/block/drbd/drbd_proc.c b/drivers/block/drbd/drbd_proc.c index 928adb815b09..bf31d41dbaad 100644 --- a/drivers/block/drbd/drbd_proc.c +++ b/drivers/block/drbd/drbd_proc.c @@ -313,8 +313,14 @@ static int drbd_seq_show(struct seq_file *seq, void *v) static int drbd_proc_open(struct inode *inode, struct file *file) { - if (try_module_get(THIS_MODULE)) - return single_open(file, drbd_seq_show, PDE_DATA(inode)); + int err; + + if (try_module_get(THIS_MODULE)) { + err = single_open(file, drbd_seq_show, PDE_DATA(inode)); + if (err) + module_put(THIS_MODULE); + return err; + } return -ENODEV; } diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 83c5ae0ed56b..4222affff488 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -850,6 +850,7 @@ int drbd_connected(struct drbd_conf *mdev) err = drbd_send_current_state(mdev); clear_bit(USE_DEGR_WFC_T, &mdev->flags); clear_bit(RESIZE_PENDING, &mdev->flags); + atomic_set(&mdev->ap_in_flight, 0); mod_timer(&mdev->request_timer, jiffies + HZ); /* just start it here. */ return err; } @@ -2266,7 +2267,7 @@ static int receive_Data(struct drbd_tconn *tconn, struct packet_info *pi) drbd_set_out_of_sync(mdev, peer_req->i.sector, peer_req->i.size); peer_req->flags |= EE_CALL_AL_COMPLETE_IO; peer_req->flags &= ~EE_MAY_SET_IN_SYNC; - drbd_al_begin_io(mdev, &peer_req->i); + drbd_al_begin_io(mdev, &peer_req->i, true); } err = drbd_submit_peer_request(mdev, peer_req, rw, DRBD_FAULT_DT_WR); @@ -2662,7 +2663,6 @@ static int drbd_asb_recover_1p(struct drbd_conf *mdev) __must_hold(local) if (hg == -1 && mdev->state.role == R_PRIMARY) { enum drbd_state_rv rv2; - drbd_set_role(mdev, R_SECONDARY, 0); /* drbd_change_state() does not sleep while in SS_IN_TRANSIENT_STATE, * we might be here in C_WF_REPORT_PARAMS which is transient. * we do not need to wait for the after state change work either. */ @@ -3993,7 +3993,7 @@ static int receive_state(struct drbd_tconn *tconn, struct packet_info *pi) clear_bit(DISCARD_MY_DATA, &mdev->flags); - drbd_md_sync(mdev); /* update connected indicator, la_size, ... */ + drbd_md_sync(mdev); /* update connected indicator, la_size_sect, ... */ return 0; } @@ -4660,8 +4660,8 @@ static int drbd_do_features(struct drbd_tconn *tconn) #if !defined(CONFIG_CRYPTO_HMAC) && !defined(CONFIG_CRYPTO_HMAC_MODULE) static int drbd_do_auth(struct drbd_tconn *tconn) { - dev_err(DEV, "This kernel was build without CONFIG_CRYPTO_HMAC.\n"); - dev_err(DEV, "You need to disable 'cram-hmac-alg' in drbd.conf.\n"); + conn_err(tconn, "This kernel was build without CONFIG_CRYPTO_HMAC.\n"); + conn_err(tconn, "You need to disable 'cram-hmac-alg' in drbd.conf.\n"); return -1; } #else @@ -5258,9 +5258,11 @@ int drbd_asender(struct drbd_thread *thi) bool ping_timeout_active = false; struct net_conf *nc; int ping_timeo, tcp_cork, ping_int; + struct sched_param param = { .sched_priority = 2 }; - current->policy = SCHED_RR; /* Make this a realtime task! */ - current->rt_priority = 2; /* more important than all other tasks */ + rv = sched_setscheduler(current, SCHED_RR, ¶m); + if (rv < 0) + conn_err(tconn, "drbd_asender: ERROR set priority, ret=%d\n", rv); while (get_t_state(thi) == RUNNING) { drbd_thread_current_set_cpu(thi); diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c index 2b8303ad63c9..c24379ffd4e3 100644 --- a/drivers/block/drbd/drbd_req.c +++ b/drivers/block/drbd/drbd_req.c @@ -34,14 +34,14 @@ static bool drbd_may_do_local_read(struct drbd_conf *mdev, sector_t sector, int size); /* Update disk stats at start of I/O request */ -static void _drbd_start_io_acct(struct drbd_conf *mdev, struct drbd_request *req, struct bio *bio) +static void _drbd_start_io_acct(struct drbd_conf *mdev, struct drbd_request *req) { - const int rw = bio_data_dir(bio); + const int rw = bio_data_dir(req->master_bio); int cpu; cpu = part_stat_lock(); part_round_stats(cpu, &mdev->vdisk->part0); part_stat_inc(cpu, &mdev->vdisk->part0, ios[rw]); - part_stat_add(cpu, &mdev->vdisk->part0, sectors[rw], bio_sectors(bio)); + part_stat_add(cpu, &mdev->vdisk->part0, sectors[rw], req->i.size >> 9); (void) cpu; /* The macro invocations above want the cpu argument, I do not like the compiler warning about cpu only assigned but never used... */ part_inc_in_flight(&mdev->vdisk->part0, rw); @@ -263,8 +263,7 @@ void drbd_req_complete(struct drbd_request *req, struct bio_and_error *m) else root = &mdev->read_requests; drbd_remove_request_interval(root, req); - } else if (!(s & RQ_POSTPONED)) - D_ASSERT((s & (RQ_NET_MASK & ~RQ_NET_DONE)) == 0); + } /* Before we can signal completion to the upper layers, * we may need to close the current transfer log epoch. @@ -755,6 +754,11 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what, D_ASSERT(req->rq_state & RQ_NET_PENDING); mod_rq_state(req, m, RQ_NET_PENDING, RQ_NET_OK|RQ_NET_DONE); break; + + case QUEUE_AS_DRBD_BARRIER: + start_new_tl_epoch(mdev->tconn); + mod_rq_state(req, m, 0, RQ_NET_OK|RQ_NET_DONE); + break; }; return rv; @@ -861,8 +865,10 @@ static void maybe_pull_ahead(struct drbd_conf *mdev) bool congested = false; enum drbd_on_congestion on_congestion; + rcu_read_lock(); nc = rcu_dereference(tconn->net_conf); on_congestion = nc ? nc->on_congestion : OC_BLOCK; + rcu_read_unlock(); if (on_congestion == OC_BLOCK || tconn->agreed_pro_version < 96) return; @@ -956,14 +962,8 @@ static int drbd_process_write_request(struct drbd_request *req) struct drbd_conf *mdev = req->w.mdev; int remote, send_oos; - rcu_read_lock(); remote = drbd_should_do_remote(mdev->state); - if (remote) { - maybe_pull_ahead(mdev); - remote = drbd_should_do_remote(mdev->state); - } send_oos = drbd_should_send_out_of_sync(mdev->state); - rcu_read_unlock(); /* Need to replicate writes. Unless it is an empty flush, * which is better mapped to a DRBD P_BARRIER packet, @@ -975,8 +975,8 @@ static int drbd_process_write_request(struct drbd_request *req) /* The only size==0 bios we expect are empty flushes. */ D_ASSERT(req->master_bio->bi_rw & REQ_FLUSH); if (remote) - start_new_tl_epoch(mdev->tconn); - return 0; + _req_mod(req, QUEUE_AS_DRBD_BARRIER); + return remote; } if (!remote && !send_oos) @@ -1020,12 +1020,24 @@ drbd_submit_req_private_bio(struct drbd_request *req) bio_endio(bio, -EIO); } -void __drbd_make_request(struct drbd_conf *mdev, struct bio *bio, unsigned long start_time) +static void drbd_queue_write(struct drbd_conf *mdev, struct drbd_request *req) { - const int rw = bio_rw(bio); - struct bio_and_error m = { NULL, }; + spin_lock(&mdev->submit.lock); + list_add_tail(&req->tl_requests, &mdev->submit.writes); + spin_unlock(&mdev->submit.lock); + queue_work(mdev->submit.wq, &mdev->submit.worker); +} + +/* returns the new drbd_request pointer, if the caller is expected to + * drbd_send_and_submit() it (to save latency), or NULL if we queued the + * request on the submitter thread. + * Returns ERR_PTR(-ENOMEM) if we cannot allocate a drbd_request. + */ +struct drbd_request * +drbd_request_prepare(struct drbd_conf *mdev, struct bio *bio, unsigned long start_time) +{ + const int rw = bio_data_dir(bio); struct drbd_request *req; - bool no_remote = false; /* allocate outside of all locks; */ req = drbd_req_new(mdev, bio); @@ -1035,7 +1047,7 @@ void __drbd_make_request(struct drbd_conf *mdev, struct bio *bio, unsigned long * if user cannot handle io errors, that's not our business. */ dev_err(DEV, "could not kmalloc() req\n"); bio_endio(bio, -ENOMEM); - return; + return ERR_PTR(-ENOMEM); } req->start_time = start_time; @@ -1044,28 +1056,40 @@ void __drbd_make_request(struct drbd_conf *mdev, struct bio *bio, unsigned long req->private_bio = NULL; } - /* For WRITES going to the local disk, grab a reference on the target - * extent. This waits for any resync activity in the corresponding - * resync extent to finish, and, if necessary, pulls in the target - * extent into the activity log, which involves further disk io because - * of transactional on-disk meta data updates. - * Empty flushes don't need to go into the activity log, they can only - * flush data for pending writes which are already in there. */ + /* Update disk stats */ + _drbd_start_io_acct(mdev, req); + if (rw == WRITE && req->private_bio && req->i.size && !test_bit(AL_SUSPENDED, &mdev->flags)) { + if (!drbd_al_begin_io_fastpath(mdev, &req->i)) { + drbd_queue_write(mdev, req); + return NULL; + } req->rq_state |= RQ_IN_ACT_LOG; - drbd_al_begin_io(mdev, &req->i); } + return req; +} + +static void drbd_send_and_submit(struct drbd_conf *mdev, struct drbd_request *req) +{ + const int rw = bio_rw(req->master_bio); + struct bio_and_error m = { NULL, }; + bool no_remote = false; + spin_lock_irq(&mdev->tconn->req_lock); if (rw == WRITE) { /* This may temporarily give up the req_lock, * but will re-aquire it before it returns here. * Needs to be before the check on drbd_suspended() */ complete_conflicting_writes(req); + /* no more giving up req_lock from now on! */ + + /* check for congestion, and potentially stop sending + * full data updates, but start sending "dirty bits" only. */ + maybe_pull_ahead(mdev); } - /* no more giving up req_lock from now on! */ if (drbd_suspended(mdev)) { /* push back and retry: */ @@ -1078,9 +1102,6 @@ void __drbd_make_request(struct drbd_conf *mdev, struct bio *bio, unsigned long goto out; } - /* Update disk stats */ - _drbd_start_io_acct(mdev, req, bio); - /* We fail READ/READA early, if we can not serve it. * We must do this before req is registered on any lists. * Otherwise, drbd_req_complete() will queue failed READ for retry. */ @@ -1137,7 +1158,116 @@ out: if (m.bio) complete_master_bio(mdev, &m); - return; +} + +void __drbd_make_request(struct drbd_conf *mdev, struct bio *bio, unsigned long start_time) +{ + struct drbd_request *req = drbd_request_prepare(mdev, bio, start_time); + if (IS_ERR_OR_NULL(req)) + return; + drbd_send_and_submit(mdev, req); +} + +static void submit_fast_path(struct drbd_conf *mdev, struct list_head *incoming) +{ + struct drbd_request *req, *tmp; + list_for_each_entry_safe(req, tmp, incoming, tl_requests) { + const int rw = bio_data_dir(req->master_bio); + + if (rw == WRITE /* rw != WRITE should not even end up here! */ + && req->private_bio && req->i.size + && !test_bit(AL_SUSPENDED, &mdev->flags)) { + if (!drbd_al_begin_io_fastpath(mdev, &req->i)) + continue; + + req->rq_state |= RQ_IN_ACT_LOG; + } + + list_del_init(&req->tl_requests); + drbd_send_and_submit(mdev, req); + } +} + +static bool prepare_al_transaction_nonblock(struct drbd_conf *mdev, + struct list_head *incoming, + struct list_head *pending) +{ + struct drbd_request *req, *tmp; + int wake = 0; + int err; + + spin_lock_irq(&mdev->al_lock); + list_for_each_entry_safe(req, tmp, incoming, tl_requests) { + err = drbd_al_begin_io_nonblock(mdev, &req->i); + if (err == -EBUSY) + wake = 1; + if (err) + continue; + req->rq_state |= RQ_IN_ACT_LOG; + list_move_tail(&req->tl_requests, pending); + } + spin_unlock_irq(&mdev->al_lock); + if (wake) + wake_up(&mdev->al_wait); + + return !list_empty(pending); +} + +void do_submit(struct work_struct *ws) +{ + struct drbd_conf *mdev = container_of(ws, struct drbd_conf, submit.worker); + LIST_HEAD(incoming); + LIST_HEAD(pending); + struct drbd_request *req, *tmp; + + for (;;) { + spin_lock(&mdev->submit.lock); + list_splice_tail_init(&mdev->submit.writes, &incoming); + spin_unlock(&mdev->submit.lock); + + submit_fast_path(mdev, &incoming); + if (list_empty(&incoming)) + break; + + wait_event(mdev->al_wait, prepare_al_transaction_nonblock(mdev, &incoming, &pending)); + /* Maybe more was queued, while we prepared the transaction? + * Try to stuff them into this transaction as well. + * Be strictly non-blocking here, no wait_event, we already + * have something to commit. + * Stop if we don't make any more progres. + */ + for (;;) { + LIST_HEAD(more_pending); + LIST_HEAD(more_incoming); + bool made_progress; + + /* It is ok to look outside the lock, + * it's only an optimization anyways */ + if (list_empty(&mdev->submit.writes)) + break; + + spin_lock(&mdev->submit.lock); + list_splice_tail_init(&mdev->submit.writes, &more_incoming); + spin_unlock(&mdev->submit.lock); + + if (list_empty(&more_incoming)) + break; + + made_progress = prepare_al_transaction_nonblock(mdev, &more_incoming, &more_pending); + + list_splice_tail_init(&more_pending, &pending); + list_splice_tail_init(&more_incoming, &incoming); + + if (!made_progress) + break; + } + drbd_al_begin_io_commit(mdev, false); + + list_for_each_entry_safe(req, tmp, &pending, tl_requests) { + list_del_init(&req->tl_requests); + drbd_send_and_submit(mdev, req); + } + } } void drbd_make_request(struct request_queue *q, struct bio *bio) diff --git a/drivers/block/drbd/drbd_req.h b/drivers/block/drbd/drbd_req.h index c08d22964d06..978cb1addc98 100644 --- a/drivers/block/drbd/drbd_req.h +++ b/drivers/block/drbd/drbd_req.h @@ -88,6 +88,14 @@ enum drbd_req_event { QUEUE_FOR_NET_READ, QUEUE_FOR_SEND_OOS, + /* An empty flush is queued as P_BARRIER, + * which will cause it to complete "successfully", + * even if the local disk flush failed. + * + * Just like "real" requests, empty flushes (blkdev_issue_flush()) will + * only see an error if neither local nor remote data is reachable. */ + QUEUE_AS_DRBD_BARRIER, + SEND_CANCELED, SEND_FAILED, HANDED_OVER_TO_NETWORK, diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c index 0fe220cfb9e9..90c5be2b1d30 100644 --- a/drivers/block/drbd/drbd_state.c +++ b/drivers/block/drbd/drbd_state.c @@ -570,6 +570,13 @@ is_valid_state(struct drbd_conf *mdev, union drbd_state ns) mdev->tconn->agreed_pro_version < 88) rv = SS_NOT_SUPPORTED; + else if (ns.role == R_PRIMARY && ns.disk < D_UP_TO_DATE && ns.pdsk < D_UP_TO_DATE) + rv = SS_NO_UP_TO_DATE_DISK; + + else if ((ns.conn == C_STARTING_SYNC_S || ns.conn == C_STARTING_SYNC_T) && + ns.pdsk == D_UNKNOWN) + rv = SS_NEED_CONNECTION; + else if (ns.conn >= C_CONNECTED && ns.pdsk == D_UNKNOWN) rv = SS_CONNECTED_OUTDATES; @@ -635,6 +642,10 @@ is_valid_soft_transition(union drbd_state os, union drbd_state ns, struct drbd_t && os.conn < C_WF_REPORT_PARAMS) rv = SS_NEED_CONNECTION; /* No NetworkFailure -> SyncTarget etc... */ + if (ns.conn == C_DISCONNECTING && ns.pdsk == D_OUTDATED && + os.conn < C_CONNECTED && os.pdsk > D_OUTDATED) + rv = SS_OUTDATE_WO_CONN; + return rv; } @@ -1377,13 +1388,6 @@ static void after_state_ch(struct drbd_conf *mdev, union drbd_state os, &drbd_bmio_set_n_write, &abw_start_sync, "set_n_write from StartingSync", BM_LOCKED_TEST_ALLOWED); - /* We are invalidating our self... */ - if (os.conn < C_CONNECTED && ns.conn < C_CONNECTED && - os.disk > D_INCONSISTENT && ns.disk == D_INCONSISTENT) - /* other bitmap operation expected during this phase */ - drbd_queue_bitmap_io(mdev, &drbd_bmio_set_n_write, NULL, - "set_n_write from invalidate", BM_LOCKED_MASK); - /* first half of local IO error, failure to attach, * or administrative detach */ if (os.disk != D_FAILED && ns.disk == D_FAILED) { @@ -1748,13 +1752,9 @@ _conn_rq_cond(struct drbd_tconn *tconn, union drbd_state mask, union drbd_state if (test_and_clear_bit(CONN_WD_ST_CHG_FAIL, &tconn->flags)) return SS_CW_FAILED_BY_PEER; - rv = tconn->cstate != C_WF_REPORT_PARAMS ? SS_CW_NO_NEED : SS_UNKNOWN_ERROR; - - if (rv == SS_UNKNOWN_ERROR) - rv = conn_is_valid_transition(tconn, mask, val, 0); - - if (rv == SS_SUCCESS) - rv = SS_UNKNOWN_ERROR; /* cont waiting, otherwise fail. */ + rv = conn_is_valid_transition(tconn, mask, val, 0); + if (rv == SS_SUCCESS && tconn->cstate == C_WF_REPORT_PARAMS) + rv = SS_UNKNOWN_ERROR; /* continue waiting */ return rv; } diff --git a/drivers/block/drbd/drbd_strings.c b/drivers/block/drbd/drbd_strings.c index 9a664bd27404..58e08ff2b2ce 100644 --- a/drivers/block/drbd/drbd_strings.c +++ b/drivers/block/drbd/drbd_strings.c @@ -89,6 +89,7 @@ static const char *drbd_state_sw_errors[] = { [-SS_LOWER_THAN_OUTDATED] = "Disk state is lower than outdated", [-SS_IN_TRANSIENT_STATE] = "In transient state, retry after next state change", [-SS_CONCURRENT_ST_CHG] = "Concurrent state changes detected and aborted", + [-SS_OUTDATE_WO_CONN] = "Need a connection for a graceful disconnect/outdate peer", [-SS_O_VOL_PEER_PRI] = "Other vol primary on peer not allowed by config", }; diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c index 424dc7bdf9b7..891c0ecaa292 100644 --- a/drivers/block/drbd/drbd_worker.c +++ b/drivers/block/drbd/drbd_worker.c @@ -89,7 +89,8 @@ void drbd_md_io_complete(struct bio *bio, int error) md_io->done = 1; wake_up(&mdev->misc_wait); bio_put(bio); - put_ldev(mdev); + if (mdev->ldev) /* special case: drbd_md_read() during drbd_adm_attach() */ + put_ldev(mdev); } /* reads on behalf of the partner, @@ -1410,7 +1411,7 @@ int w_restart_disk_io(struct drbd_work *w, int cancel) struct drbd_conf *mdev = w->mdev; if (bio_data_dir(req->master_bio) == WRITE && req->rq_state & RQ_IN_ACT_LOG) - drbd_al_begin_io(mdev, &req->i); + drbd_al_begin_io(mdev, &req->i, false); drbd_req_make_private_bio(req, req->master_bio); req->private_bio->bi_bdev = mdev->ldev->backing_bdev; @@ -1425,7 +1426,7 @@ static int _drbd_may_sync_now(struct drbd_conf *mdev) int resync_after; while (1) { - if (!odev->ldev) + if (!odev->ldev || odev->state.disk == D_DISKLESS) return 1; rcu_read_lock(); resync_after = rcu_dereference(odev->ldev->disk_conf)->resync_after; @@ -1433,7 +1434,7 @@ static int _drbd_may_sync_now(struct drbd_conf *mdev) if (resync_after == -1) return 1; odev = minor_to_mdev(resync_after); - if (!expect(odev)) + if (!odev) return 1; if ((odev->state.conn >= C_SYNC_SOURCE && odev->state.conn <= C_PAUSED_SYNC_T) || @@ -1515,7 +1516,7 @@ enum drbd_ret_code drbd_resync_after_valid(struct drbd_conf *mdev, int o_minor) if (o_minor == -1) return NO_ERROR; - if (o_minor < -1 || minor_to_mdev(o_minor) == NULL) + if (o_minor < -1 || o_minor > MINORMASK) return ERR_RESYNC_AFTER; /* check for loops */ @@ -1524,6 +1525,15 @@ enum drbd_ret_code drbd_resync_after_valid(struct drbd_conf *mdev, int o_minor) if (odev == mdev) return ERR_RESYNC_AFTER_CYCLE; + /* You are free to depend on diskless, non-existing, + * or not yet/no longer existing minors. + * We only reject dependency loops. + * We cannot follow the dependency chain beyond a detached or + * missing minor. + */ + if (!odev || !odev->ldev || odev->state.disk == D_DISKLESS) + return NO_ERROR; + rcu_read_lock(); resync_after = rcu_dereference(odev->ldev->disk_conf)->resync_after; rcu_read_unlock(); @@ -1652,7 +1662,9 @@ void drbd_start_resync(struct drbd_conf *mdev, enum drbd_conns side) clear_bit(B_RS_H_DONE, &mdev->flags); write_lock_irq(&global_state_lock); - if (!get_ldev_if_state(mdev, D_NEGOTIATING)) { + /* Did some connection breakage or IO error race with us? */ + if (mdev->state.conn < C_CONNECTED + || !get_ldev_if_state(mdev, D_NEGOTIATING)) { write_unlock_irq(&global_state_lock); mutex_unlock(mdev->state_mutex); return; diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index c49e85608101..04ceb7e2fadd 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -3775,7 +3775,6 @@ static int __floppy_read_block_0(struct block_device *bdev) bio_vec.bv_len = size; bio_vec.bv_offset = 0; bio.bi_vcnt = 1; - bio.bi_idx = 0; bio.bi_size = size; bio.bi_bdev = bdev; bio.bi_sector = 0; diff --git a/drivers/block/mg_disk.c b/drivers/block/mg_disk.c index 076ae7f1b781..a56cfcd5d648 100644 --- a/drivers/block/mg_disk.c +++ b/drivers/block/mg_disk.c @@ -780,6 +780,7 @@ static const struct block_device_operations mg_disk_ops = { .getgeo = mg_getgeo }; +#ifdef CONFIG_PM_SLEEP static int mg_suspend(struct device *dev) { struct mg_drv_data *prv_data = dev->platform_data; @@ -824,6 +825,7 @@ static int mg_resume(struct device *dev) return 0; } +#endif static SIMPLE_DEV_PM_OPS(mg_pm, mg_suspend, mg_resume); diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 32c678028e53..847107ef0cce 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -728,7 +728,10 @@ static void mtip_async_complete(struct mtip_port *port, atomic_set(&port->commands[tag].active, 0); release_slot(port, tag); - up(&port->cmd_slot); + if (unlikely(command->unaligned)) + up(&port->cmd_slot_unal); + else + up(&port->cmd_slot); } /* @@ -1560,10 +1563,12 @@ static int mtip_get_identify(struct mtip_port *port, void __user *user_buffer) } #endif +#ifdef MTIP_TRIM /* Disabling TRIM support temporarily */ /* Demux ID.DRAT & ID.RZAT to determine trim support */ if (port->identify[69] & (1 << 14) && port->identify[69] & (1 << 5)) port->dd->trim_supp = true; else +#endif port->dd->trim_supp = false; /* Set the identify buffer as valid. */ @@ -2557,7 +2562,7 @@ static int mtip_hw_ioctl(struct driver_data *dd, unsigned int cmd, */ static void mtip_hw_submit_io(struct driver_data *dd, sector_t sector, int nsect, int nents, int tag, void *callback, - void *data, int dir) + void *data, int dir, int unaligned) { struct host_to_dev_fis *fis; struct mtip_port *port = dd->port; @@ -2570,6 +2575,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, sector_t sector, command->scatter_ents = nents; + command->unaligned = unaligned; /* * The number of retries for this command before it is * reported as a failure to the upper layers. @@ -2598,6 +2604,9 @@ static void mtip_hw_submit_io(struct driver_data *dd, sector_t sector, fis->res3 = 0; fill_command_sg(dd, command, nents); + if (unaligned) + fis->device |= 1 << 7; + /* Populate the command header */ command->command_header->opts = __force_bit2int cpu_to_le32( @@ -2644,9 +2653,13 @@ static void mtip_hw_submit_io(struct driver_data *dd, sector_t sector, * return value * None */ -static void mtip_hw_release_scatterlist(struct driver_data *dd, int tag) +static void mtip_hw_release_scatterlist(struct driver_data *dd, int tag, + int unaligned) { + struct semaphore *sem = unaligned ? &dd->port->cmd_slot_unal : + &dd->port->cmd_slot; release_slot(dd->port, tag); + up(sem); } /* @@ -2661,22 +2674,25 @@ static void mtip_hw_release_scatterlist(struct driver_data *dd, int tag) * or NULL if no command slots are available. */ static struct scatterlist *mtip_hw_get_scatterlist(struct driver_data *dd, - int *tag) + int *tag, int unaligned) { + struct semaphore *sem = unaligned ? &dd->port->cmd_slot_unal : + &dd->port->cmd_slot; + /* * It is possible that, even with this semaphore, a thread * may think that no command slots are available. Therefore, we * need to make an attempt to get_slot(). */ - down(&dd->port->cmd_slot); + down(sem); *tag = get_slot(dd->port); if (unlikely(test_bit(MTIP_DDF_REMOVE_PENDING_BIT, &dd->dd_flag))) { - up(&dd->port->cmd_slot); + up(sem); return NULL; } if (unlikely(*tag < 0)) { - up(&dd->port->cmd_slot); + up(sem); return NULL; } @@ -3010,6 +3026,11 @@ static inline void hba_setup(struct driver_data *dd) dd->mmio + HOST_HSORG); } +static int mtip_device_unaligned_constrained(struct driver_data *dd) +{ + return (dd->pdev->device == P420M_DEVICE_ID ? 1 : 0); +} + /* * Detect the details of the product, and store anything needed * into the driver data structure. This includes product type and @@ -3232,8 +3253,15 @@ static int mtip_hw_init(struct driver_data *dd) for (i = 0; i < MTIP_MAX_SLOT_GROUPS; i++) dd->work[i].port = dd->port; + /* Enable unaligned IO constraints for some devices */ + if (mtip_device_unaligned_constrained(dd)) + dd->unal_qdepth = MTIP_MAX_UNALIGNED_SLOTS; + else + dd->unal_qdepth = 0; + /* Counting semaphore to track command slot usage */ - sema_init(&dd->port->cmd_slot, num_command_slots - 1); + sema_init(&dd->port->cmd_slot, num_command_slots - 1 - dd->unal_qdepth); + sema_init(&dd->port->cmd_slot_unal, dd->unal_qdepth); /* Spinlock to prevent concurrent issue */ for (i = 0; i < MTIP_MAX_SLOT_GROUPS; i++) @@ -3836,7 +3864,7 @@ static void mtip_make_request(struct request_queue *queue, struct bio *bio) struct scatterlist *sg; struct bio_vec *bvec; int nents = 0; - int tag = 0; + int tag = 0, unaligned = 0; if (unlikely(dd->dd_flag & MTIP_DDF_STOP_IO)) { if (unlikely(test_bit(MTIP_DDF_REMOVE_PENDING_BIT, @@ -3872,7 +3900,15 @@ static void mtip_make_request(struct request_queue *queue, struct bio *bio) return; } - sg = mtip_hw_get_scatterlist(dd, &tag); + if (bio_data_dir(bio) == WRITE && bio_sectors(bio) <= 64 && + dd->unal_qdepth) { + if (bio->bi_sector % 8 != 0) /* Unaligned on 4k boundaries */ + unaligned = 1; + else if (bio_sectors(bio) % 8 != 0) /* Aligned but not 4k/8k */ + unaligned = 1; + } + + sg = mtip_hw_get_scatterlist(dd, &tag, unaligned); if (likely(sg != NULL)) { blk_queue_bounce(queue, &bio); @@ -3880,7 +3916,7 @@ static void mtip_make_request(struct request_queue *queue, struct bio *bio) dev_warn(&dd->pdev->dev, "Maximum number of SGL entries exceeded\n"); bio_io_error(bio); - mtip_hw_release_scatterlist(dd, tag); + mtip_hw_release_scatterlist(dd, tag, unaligned); return; } @@ -3900,7 +3936,8 @@ static void mtip_make_request(struct request_queue *queue, struct bio *bio) tag, bio_endio, bio, - bio_data_dir(bio)); + bio_data_dir(bio), + unaligned); } else bio_io_error(bio); } @@ -4156,26 +4193,24 @@ static int mtip_block_remove(struct driver_data *dd) */ static int mtip_block_shutdown(struct driver_data *dd) { - dev_info(&dd->pdev->dev, - "Shutting down %s ...\n", dd->disk->disk_name); - /* Delete our gendisk structure, and cleanup the blk queue. */ if (dd->disk) { - if (dd->disk->queue) + dev_info(&dd->pdev->dev, + "Shutting down %s ...\n", dd->disk->disk_name); + + if (dd->disk->queue) { del_gendisk(dd->disk); - else + blk_cleanup_queue(dd->queue); + } else put_disk(dd->disk); + dd->disk = NULL; + dd->queue = NULL; } - spin_lock(&rssd_index_lock); ida_remove(&rssd_index_ida, dd->index); spin_unlock(&rssd_index_lock); - blk_cleanup_queue(dd->queue); - dd->disk = NULL; - dd->queue = NULL; - mtip_hw_shutdown(dd); return 0; } diff --git a/drivers/block/mtip32xx/mtip32xx.h b/drivers/block/mtip32xx/mtip32xx.h index 8e8334c9dd0f..3bb8a295fbe4 100644 --- a/drivers/block/mtip32xx/mtip32xx.h +++ b/drivers/block/mtip32xx/mtip32xx.h @@ -52,6 +52,9 @@ #define MTIP_FTL_REBUILD_MAGIC 0xED51 #define MTIP_FTL_REBUILD_TIMEOUT_MS 2400000 +/* unaligned IO handling */ +#define MTIP_MAX_UNALIGNED_SLOTS 8 + /* Macro to extract the tag bit number from a tag value. */ #define MTIP_TAG_BIT(tag) (tag & 0x1F) @@ -333,6 +336,8 @@ struct mtip_cmd { int scatter_ents; /* Number of scatter list entries used */ + int unaligned; /* command is unaligned on 4k boundary */ + struct scatterlist sg[MTIP_MAX_SG]; /* Scatter list entries */ int retries; /* The number of retries left for this command. */ @@ -452,6 +457,10 @@ struct mtip_port { * command slots available. */ struct semaphore cmd_slot; + + /* Semaphore to control queue depth of unaligned IOs */ + struct semaphore cmd_slot_unal; + /* Spinlock for working around command-issue bug. */ spinlock_t cmd_issue_lock[MTIP_MAX_SLOT_GROUPS]; }; @@ -502,6 +511,8 @@ struct driver_data { int isr_binding; + int unal_qdepth; /* qdepth of unaligned IO queue */ + struct list_head online_list; /* linkage for online list */ struct list_head remove_list; /* linkage for removing list */ diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c new file mode 100644 index 000000000000..8efdfaa44a59 --- /dev/null +++ b/drivers/block/nvme-core.c @@ -0,0 +1,2073 @@ +/* + * NVM Express device driver + * Copyright (c) 2011, Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define NVME_Q_DEPTH 1024 +#define SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) +#define CQ_SIZE(depth) (depth * sizeof(struct nvme_completion)) +#define NVME_MINORS 64 +#define ADMIN_TIMEOUT (60 * HZ) + +static int nvme_major; +module_param(nvme_major, int, 0); + +static int use_threaded_interrupts; +module_param(use_threaded_interrupts, int, 0); + +static DEFINE_SPINLOCK(dev_list_lock); +static LIST_HEAD(dev_list); +static struct task_struct *nvme_thread; + +/* + * An NVM Express queue. Each device has at least two (one for admin + * commands and one for I/O commands). + */ +struct nvme_queue { + struct device *q_dmadev; + struct nvme_dev *dev; + spinlock_t q_lock; + struct nvme_command *sq_cmds; + volatile struct nvme_completion *cqes; + dma_addr_t sq_dma_addr; + dma_addr_t cq_dma_addr; + wait_queue_head_t sq_full; + wait_queue_t sq_cong_wait; + struct bio_list sq_cong; + u32 __iomem *q_db; + u16 q_depth; + u16 cq_vector; + u16 sq_head; + u16 sq_tail; + u16 cq_head; + u16 cq_phase; + unsigned long cmdid_data[]; +}; + +/* + * Check we didin't inadvertently grow the command struct + */ +static inline void _nvme_check_size(void) +{ + BUILD_BUG_ON(sizeof(struct nvme_rw_command) != 64); + BUILD_BUG_ON(sizeof(struct nvme_create_cq) != 64); + BUILD_BUG_ON(sizeof(struct nvme_create_sq) != 64); + BUILD_BUG_ON(sizeof(struct nvme_delete_queue) != 64); + BUILD_BUG_ON(sizeof(struct nvme_features) != 64); + BUILD_BUG_ON(sizeof(struct nvme_format_cmd) != 64); + BUILD_BUG_ON(sizeof(struct nvme_command) != 64); + BUILD_BUG_ON(sizeof(struct nvme_id_ctrl) != 4096); + BUILD_BUG_ON(sizeof(struct nvme_id_ns) != 4096); + BUILD_BUG_ON(sizeof(struct nvme_lba_range_type) != 64); + BUILD_BUG_ON(sizeof(struct nvme_smart_log) != 512); +} + +typedef void (*nvme_completion_fn)(struct nvme_dev *, void *, + struct nvme_completion *); + +struct nvme_cmd_info { + nvme_completion_fn fn; + void *ctx; + unsigned long timeout; +}; + +static struct nvme_cmd_info *nvme_cmd_info(struct nvme_queue *nvmeq) +{ + return (void *)&nvmeq->cmdid_data[BITS_TO_LONGS(nvmeq->q_depth)]; +} + +/** + * alloc_cmdid() - Allocate a Command ID + * @nvmeq: The queue that will be used for this command + * @ctx: A pointer that will be passed to the handler + * @handler: The function to call on completion + * + * Allocate a Command ID for a queue. The data passed in will + * be passed to the completion handler. This is implemented by using + * the bottom two bits of the ctx pointer to store the handler ID. + * Passing in a pointer that's not 4-byte aligned will cause a BUG. + * We can change this if it becomes a problem. + * + * May be called with local interrupts disabled and the q_lock held, + * or with interrupts enabled and no locks held. + */ +static int alloc_cmdid(struct nvme_queue *nvmeq, void *ctx, + nvme_completion_fn handler, unsigned timeout) +{ + int depth = nvmeq->q_depth - 1; + struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); + int cmdid; + + do { + cmdid = find_first_zero_bit(nvmeq->cmdid_data, depth); + if (cmdid >= depth) + return -EBUSY; + } while (test_and_set_bit(cmdid, nvmeq->cmdid_data)); + + info[cmdid].fn = handler; + info[cmdid].ctx = ctx; + info[cmdid].timeout = jiffies + timeout; + return cmdid; +} + +static int alloc_cmdid_killable(struct nvme_queue *nvmeq, void *ctx, + nvme_completion_fn handler, unsigned timeout) +{ + int cmdid; + wait_event_killable(nvmeq->sq_full, + (cmdid = alloc_cmdid(nvmeq, ctx, handler, timeout)) >= 0); + return (cmdid < 0) ? -EINTR : cmdid; +} + +/* Special values must be less than 0x1000 */ +#define CMD_CTX_BASE ((void *)POISON_POINTER_DELTA) +#define CMD_CTX_CANCELLED (0x30C + CMD_CTX_BASE) +#define CMD_CTX_COMPLETED (0x310 + CMD_CTX_BASE) +#define CMD_CTX_INVALID (0x314 + CMD_CTX_BASE) +#define CMD_CTX_FLUSH (0x318 + CMD_CTX_BASE) + +static void special_completion(struct nvme_dev *dev, void *ctx, + struct nvme_completion *cqe) +{ + if (ctx == CMD_CTX_CANCELLED) + return; + if (ctx == CMD_CTX_FLUSH) + return; + if (ctx == CMD_CTX_COMPLETED) { + dev_warn(&dev->pci_dev->dev, + "completed id %d twice on queue %d\n", + cqe->command_id, le16_to_cpup(&cqe->sq_id)); + return; + } + if (ctx == CMD_CTX_INVALID) { + dev_warn(&dev->pci_dev->dev, + "invalid id %d completed on queue %d\n", + cqe->command_id, le16_to_cpup(&cqe->sq_id)); + return; + } + + dev_warn(&dev->pci_dev->dev, "Unknown special completion %p\n", ctx); +} + +/* + * Called with local interrupts disabled and the q_lock held. May not sleep. + */ +static void *free_cmdid(struct nvme_queue *nvmeq, int cmdid, + nvme_completion_fn *fn) +{ + void *ctx; + struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); + + if (cmdid >= nvmeq->q_depth) { + *fn = special_completion; + return CMD_CTX_INVALID; + } + if (fn) + *fn = info[cmdid].fn; + ctx = info[cmdid].ctx; + info[cmdid].fn = special_completion; + info[cmdid].ctx = CMD_CTX_COMPLETED; + clear_bit(cmdid, nvmeq->cmdid_data); + wake_up(&nvmeq->sq_full); + return ctx; +} + +static void *cancel_cmdid(struct nvme_queue *nvmeq, int cmdid, + nvme_completion_fn *fn) +{ + void *ctx; + struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); + if (fn) + *fn = info[cmdid].fn; + ctx = info[cmdid].ctx; + info[cmdid].fn = special_completion; + info[cmdid].ctx = CMD_CTX_CANCELLED; + return ctx; +} + +struct nvme_queue *get_nvmeq(struct nvme_dev *dev) +{ + return dev->queues[get_cpu() + 1]; +} + +void put_nvmeq(struct nvme_queue *nvmeq) +{ + put_cpu(); +} + +/** + * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell + * @nvmeq: The queue to use + * @cmd: The command to send + * + * Safe to use from interrupt context + */ +static int nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd) +{ + unsigned long flags; + u16 tail; + spin_lock_irqsave(&nvmeq->q_lock, flags); + tail = nvmeq->sq_tail; + memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); + if (++tail == nvmeq->q_depth) + tail = 0; + writel(tail, nvmeq->q_db); + nvmeq->sq_tail = tail; + spin_unlock_irqrestore(&nvmeq->q_lock, flags); + + return 0; +} + +static __le64 **iod_list(struct nvme_iod *iod) +{ + return ((void *)iod) + iod->offset; +} + +/* + * Will slightly overestimate the number of pages needed. This is OK + * as it only leads to a small amount of wasted memory for the lifetime of + * the I/O. + */ +static int nvme_npages(unsigned size) +{ + unsigned nprps = DIV_ROUND_UP(size + PAGE_SIZE, PAGE_SIZE); + return DIV_ROUND_UP(8 * nprps, PAGE_SIZE - 8); +} + +static struct nvme_iod * +nvme_alloc_iod(unsigned nseg, unsigned nbytes, gfp_t gfp) +{ + struct nvme_iod *iod = kmalloc(sizeof(struct nvme_iod) + + sizeof(__le64 *) * nvme_npages(nbytes) + + sizeof(struct scatterlist) * nseg, gfp); + + if (iod) { + iod->offset = offsetof(struct nvme_iod, sg[nseg]); + iod->npages = -1; + iod->length = nbytes; + iod->nents = 0; + } + + return iod; +} + +void nvme_free_iod(struct nvme_dev *dev, struct nvme_iod *iod) +{ + const int last_prp = PAGE_SIZE / 8 - 1; + int i; + __le64 **list = iod_list(iod); + dma_addr_t prp_dma = iod->first_dma; + + if (iod->npages == 0) + dma_pool_free(dev->prp_small_pool, list[0], prp_dma); + for (i = 0; i < iod->npages; i++) { + __le64 *prp_list = list[i]; + dma_addr_t next_prp_dma = le64_to_cpu(prp_list[last_prp]); + dma_pool_free(dev->prp_page_pool, prp_list, prp_dma); + prp_dma = next_prp_dma; + } + kfree(iod); +} + +static void bio_completion(struct nvme_dev *dev, void *ctx, + struct nvme_completion *cqe) +{ + struct nvme_iod *iod = ctx; + struct bio *bio = iod->private; + u16 status = le16_to_cpup(&cqe->status) >> 1; + + if (iod->nents) + dma_unmap_sg(&dev->pci_dev->dev, iod->sg, iod->nents, + bio_data_dir(bio) ? DMA_TO_DEVICE : DMA_FROM_DEVICE); + nvme_free_iod(dev, iod); + if (status) + bio_endio(bio, -EIO); + else + bio_endio(bio, 0); +} + +/* length is in bytes. gfp flags indicates whether we may sleep. */ +int nvme_setup_prps(struct nvme_dev *dev, struct nvme_common_command *cmd, + struct nvme_iod *iod, int total_len, gfp_t gfp) +{ + struct dma_pool *pool; + int length = total_len; + struct scatterlist *sg = iod->sg; + int dma_len = sg_dma_len(sg); + u64 dma_addr = sg_dma_address(sg); + int offset = offset_in_page(dma_addr); + __le64 *prp_list; + __le64 **list = iod_list(iod); + dma_addr_t prp_dma; + int nprps, i; + + cmd->prp1 = cpu_to_le64(dma_addr); + length -= (PAGE_SIZE - offset); + if (length <= 0) + return total_len; + + dma_len -= (PAGE_SIZE - offset); + if (dma_len) { + dma_addr += (PAGE_SIZE - offset); + } else { + sg = sg_next(sg); + dma_addr = sg_dma_address(sg); + dma_len = sg_dma_len(sg); + } + + if (length <= PAGE_SIZE) { + cmd->prp2 = cpu_to_le64(dma_addr); + return total_len; + } + + nprps = DIV_ROUND_UP(length, PAGE_SIZE); + if (nprps <= (256 / 8)) { + pool = dev->prp_small_pool; + iod->npages = 0; + } else { + pool = dev->prp_page_pool; + iod->npages = 1; + } + + prp_list = dma_pool_alloc(pool, gfp, &prp_dma); + if (!prp_list) { + cmd->prp2 = cpu_to_le64(dma_addr); + iod->npages = -1; + return (total_len - length) + PAGE_SIZE; + } + list[0] = prp_list; + iod->first_dma = prp_dma; + cmd->prp2 = cpu_to_le64(prp_dma); + i = 0; + for (;;) { + if (i == PAGE_SIZE / 8) { + __le64 *old_prp_list = prp_list; + prp_list = dma_pool_alloc(pool, gfp, &prp_dma); + if (!prp_list) + return total_len - length; + list[iod->npages++] = prp_list; + prp_list[0] = old_prp_list[i - 1]; + old_prp_list[i - 1] = cpu_to_le64(prp_dma); + i = 1; + } + prp_list[i++] = cpu_to_le64(dma_addr); + dma_len -= PAGE_SIZE; + dma_addr += PAGE_SIZE; + length -= PAGE_SIZE; + if (length <= 0) + break; + if (dma_len > 0) + continue; + BUG_ON(dma_len < 0); + sg = sg_next(sg); + dma_addr = sg_dma_address(sg); + dma_len = sg_dma_len(sg); + } + + return total_len; +} + +struct nvme_bio_pair { + struct bio b1, b2, *parent; + struct bio_vec *bv1, *bv2; + int err; + atomic_t cnt; +}; + +static void nvme_bio_pair_endio(struct bio *bio, int err) +{ + struct nvme_bio_pair *bp = bio->bi_private; + + if (err) + bp->err = err; + + if (atomic_dec_and_test(&bp->cnt)) { + bio_endio(bp->parent, bp->err); + if (bp->bv1) + kfree(bp->bv1); + if (bp->bv2) + kfree(bp->bv2); + kfree(bp); + } +} + +static struct nvme_bio_pair *nvme_bio_split(struct bio *bio, int idx, + int len, int offset) +{ + struct nvme_bio_pair *bp; + + BUG_ON(len > bio->bi_size); + BUG_ON(idx > bio->bi_vcnt); + + bp = kmalloc(sizeof(*bp), GFP_ATOMIC); + if (!bp) + return NULL; + bp->err = 0; + + bp->b1 = *bio; + bp->b2 = *bio; + + bp->b1.bi_size = len; + bp->b2.bi_size -= len; + bp->b1.bi_vcnt = idx; + bp->b2.bi_idx = idx; + bp->b2.bi_sector += len >> 9; + + if (offset) { + bp->bv1 = kmalloc(bio->bi_max_vecs * sizeof(struct bio_vec), + GFP_ATOMIC); + if (!bp->bv1) + goto split_fail_1; + + bp->bv2 = kmalloc(bio->bi_max_vecs * sizeof(struct bio_vec), + GFP_ATOMIC); + if (!bp->bv2) + goto split_fail_2; + + memcpy(bp->bv1, bio->bi_io_vec, + bio->bi_max_vecs * sizeof(struct bio_vec)); + memcpy(bp->bv2, bio->bi_io_vec, + bio->bi_max_vecs * sizeof(struct bio_vec)); + + bp->b1.bi_io_vec = bp->bv1; + bp->b2.bi_io_vec = bp->bv2; + bp->b2.bi_io_vec[idx].bv_offset += offset; + bp->b2.bi_io_vec[idx].bv_len -= offset; + bp->b1.bi_io_vec[idx].bv_len = offset; + bp->b1.bi_vcnt++; + } else + bp->bv1 = bp->bv2 = NULL; + + bp->b1.bi_private = bp; + bp->b2.bi_private = bp; + + bp->b1.bi_end_io = nvme_bio_pair_endio; + bp->b2.bi_end_io = nvme_bio_pair_endio; + + bp->parent = bio; + atomic_set(&bp->cnt, 2); + + return bp; + + split_fail_2: + kfree(bp->bv1); + split_fail_1: + kfree(bp); + return NULL; +} + +static int nvme_split_and_submit(struct bio *bio, struct nvme_queue *nvmeq, + int idx, int len, int offset) +{ + struct nvme_bio_pair *bp = nvme_bio_split(bio, idx, len, offset); + if (!bp) + return -ENOMEM; + + if (bio_list_empty(&nvmeq->sq_cong)) + add_wait_queue(&nvmeq->sq_full, &nvmeq->sq_cong_wait); + bio_list_add(&nvmeq->sq_cong, &bp->b1); + bio_list_add(&nvmeq->sq_cong, &bp->b2); + + return 0; +} + +/* NVMe scatterlists require no holes in the virtual address */ +#define BIOVEC_NOT_VIRT_MERGEABLE(vec1, vec2) ((vec2)->bv_offset || \ + (((vec1)->bv_offset + (vec1)->bv_len) % PAGE_SIZE)) + +static int nvme_map_bio(struct nvme_queue *nvmeq, struct nvme_iod *iod, + struct bio *bio, enum dma_data_direction dma_dir, int psegs) +{ + struct bio_vec *bvec, *bvprv = NULL; + struct scatterlist *sg = NULL; + int i, length = 0, nsegs = 0, split_len = bio->bi_size; + + if (nvmeq->dev->stripe_size) + split_len = nvmeq->dev->stripe_size - + ((bio->bi_sector << 9) & (nvmeq->dev->stripe_size - 1)); + + sg_init_table(iod->sg, psegs); + bio_for_each_segment(bvec, bio, i) { + if (bvprv && BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) { + sg->length += bvec->bv_len; + } else { + if (bvprv && BIOVEC_NOT_VIRT_MERGEABLE(bvprv, bvec)) + return nvme_split_and_submit(bio, nvmeq, i, + length, 0); + + sg = sg ? sg + 1 : iod->sg; + sg_set_page(sg, bvec->bv_page, bvec->bv_len, + bvec->bv_offset); + nsegs++; + } + + if (split_len - length < bvec->bv_len) + return nvme_split_and_submit(bio, nvmeq, i, split_len, + split_len - length); + length += bvec->bv_len; + bvprv = bvec; + } + iod->nents = nsegs; + sg_mark_end(sg); + if (dma_map_sg(nvmeq->q_dmadev, iod->sg, iod->nents, dma_dir) == 0) + return -ENOMEM; + + BUG_ON(length != bio->bi_size); + return length; +} + +/* + * We reuse the small pool to allocate the 16-byte range here as it is not + * worth having a special pool for these or additional cases to handle freeing + * the iod. + */ +static int nvme_submit_discard(struct nvme_queue *nvmeq, struct nvme_ns *ns, + struct bio *bio, struct nvme_iod *iod, int cmdid) +{ + struct nvme_dsm_range *range; + struct nvme_command *cmnd = &nvmeq->sq_cmds[nvmeq->sq_tail]; + + range = dma_pool_alloc(nvmeq->dev->prp_small_pool, GFP_ATOMIC, + &iod->first_dma); + if (!range) + return -ENOMEM; + + iod_list(iod)[0] = (__le64 *)range; + iod->npages = 0; + + range->cattr = cpu_to_le32(0); + range->nlb = cpu_to_le32(bio->bi_size >> ns->lba_shift); + range->slba = cpu_to_le64(nvme_block_nr(ns, bio->bi_sector)); + + memset(cmnd, 0, sizeof(*cmnd)); + cmnd->dsm.opcode = nvme_cmd_dsm; + cmnd->dsm.command_id = cmdid; + cmnd->dsm.nsid = cpu_to_le32(ns->ns_id); + cmnd->dsm.prp1 = cpu_to_le64(iod->first_dma); + cmnd->dsm.nr = 0; + cmnd->dsm.attributes = cpu_to_le32(NVME_DSMGMT_AD); + + if (++nvmeq->sq_tail == nvmeq->q_depth) + nvmeq->sq_tail = 0; + writel(nvmeq->sq_tail, nvmeq->q_db); + + return 0; +} + +static int nvme_submit_flush(struct nvme_queue *nvmeq, struct nvme_ns *ns, + int cmdid) +{ + struct nvme_command *cmnd = &nvmeq->sq_cmds[nvmeq->sq_tail]; + + memset(cmnd, 0, sizeof(*cmnd)); + cmnd->common.opcode = nvme_cmd_flush; + cmnd->common.command_id = cmdid; + cmnd->common.nsid = cpu_to_le32(ns->ns_id); + + if (++nvmeq->sq_tail == nvmeq->q_depth) + nvmeq->sq_tail = 0; + writel(nvmeq->sq_tail, nvmeq->q_db); + + return 0; +} + +int nvme_submit_flush_data(struct nvme_queue *nvmeq, struct nvme_ns *ns) +{ + int cmdid = alloc_cmdid(nvmeq, (void *)CMD_CTX_FLUSH, + special_completion, NVME_IO_TIMEOUT); + if (unlikely(cmdid < 0)) + return cmdid; + + return nvme_submit_flush(nvmeq, ns, cmdid); +} + +/* + * Called with local interrupts disabled and the q_lock held. May not sleep. + */ +static int nvme_submit_bio_queue(struct nvme_queue *nvmeq, struct nvme_ns *ns, + struct bio *bio) +{ + struct nvme_command *cmnd; + struct nvme_iod *iod; + enum dma_data_direction dma_dir; + int cmdid, length, result = -ENOMEM; + u16 control; + u32 dsmgmt; + int psegs = bio_phys_segments(ns->queue, bio); + + if ((bio->bi_rw & REQ_FLUSH) && psegs) { + result = nvme_submit_flush_data(nvmeq, ns); + if (result) + return result; + } + + iod = nvme_alloc_iod(psegs, bio->bi_size, GFP_ATOMIC); + if (!iod) + goto nomem; + iod->private = bio; + + result = -EBUSY; + cmdid = alloc_cmdid(nvmeq, iod, bio_completion, NVME_IO_TIMEOUT); + if (unlikely(cmdid < 0)) + goto free_iod; + + if (bio->bi_rw & REQ_DISCARD) { + result = nvme_submit_discard(nvmeq, ns, bio, iod, cmdid); + if (result) + goto free_cmdid; + return result; + } + if ((bio->bi_rw & REQ_FLUSH) && !psegs) + return nvme_submit_flush(nvmeq, ns, cmdid); + + control = 0; + if (bio->bi_rw & REQ_FUA) + control |= NVME_RW_FUA; + if (bio->bi_rw & (REQ_FAILFAST_DEV | REQ_RAHEAD)) + control |= NVME_RW_LR; + + dsmgmt = 0; + if (bio->bi_rw & REQ_RAHEAD) + dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH; + + cmnd = &nvmeq->sq_cmds[nvmeq->sq_tail]; + + memset(cmnd, 0, sizeof(*cmnd)); + if (bio_data_dir(bio)) { + cmnd->rw.opcode = nvme_cmd_write; + dma_dir = DMA_TO_DEVICE; + } else { + cmnd->rw.opcode = nvme_cmd_read; + dma_dir = DMA_FROM_DEVICE; + } + + result = nvme_map_bio(nvmeq, iod, bio, dma_dir, psegs); + if (result <= 0) + goto free_cmdid; + length = result; + + cmnd->rw.command_id = cmdid; + cmnd->rw.nsid = cpu_to_le32(ns->ns_id); + length = nvme_setup_prps(nvmeq->dev, &cmnd->common, iod, length, + GFP_ATOMIC); + cmnd->rw.slba = cpu_to_le64(nvme_block_nr(ns, bio->bi_sector)); + cmnd->rw.length = cpu_to_le16((length >> ns->lba_shift) - 1); + cmnd->rw.control = cpu_to_le16(control); + cmnd->rw.dsmgmt = cpu_to_le32(dsmgmt); + + if (++nvmeq->sq_tail == nvmeq->q_depth) + nvmeq->sq_tail = 0; + writel(nvmeq->sq_tail, nvmeq->q_db); + + return 0; + + free_cmdid: + free_cmdid(nvmeq, cmdid, NULL); + free_iod: + nvme_free_iod(nvmeq->dev, iod); + nomem: + return result; +} + +static void nvme_make_request(struct request_queue *q, struct bio *bio) +{ + struct nvme_ns *ns = q->queuedata; + struct nvme_queue *nvmeq = get_nvmeq(ns->dev); + int result = -EBUSY; + + spin_lock_irq(&nvmeq->q_lock); + if (bio_list_empty(&nvmeq->sq_cong)) + result = nvme_submit_bio_queue(nvmeq, ns, bio); + if (unlikely(result)) { + if (bio_list_empty(&nvmeq->sq_cong)) + add_wait_queue(&nvmeq->sq_full, &nvmeq->sq_cong_wait); + bio_list_add(&nvmeq->sq_cong, bio); + } + + spin_unlock_irq(&nvmeq->q_lock); + put_nvmeq(nvmeq); +} + +static irqreturn_t nvme_process_cq(struct nvme_queue *nvmeq) +{ + u16 head, phase; + + head = nvmeq->cq_head; + phase = nvmeq->cq_phase; + + for (;;) { + void *ctx; + nvme_completion_fn fn; + struct nvme_completion cqe = nvmeq->cqes[head]; + if ((le16_to_cpu(cqe.status) & 1) != phase) + break; + nvmeq->sq_head = le16_to_cpu(cqe.sq_head); + if (++head == nvmeq->q_depth) { + head = 0; + phase = !phase; + } + + ctx = free_cmdid(nvmeq, cqe.command_id, &fn); + fn(nvmeq->dev, ctx, &cqe); + } + + /* If the controller ignores the cq head doorbell and continuously + * writes to the queue, it is theoretically possible to wrap around + * the queue twice and mistakenly return IRQ_NONE. Linux only + * requires that 0.1% of your interrupts are handled, so this isn't + * a big problem. + */ + if (head == nvmeq->cq_head && phase == nvmeq->cq_phase) + return IRQ_NONE; + + writel(head, nvmeq->q_db + (1 << nvmeq->dev->db_stride)); + nvmeq->cq_head = head; + nvmeq->cq_phase = phase; + + return IRQ_HANDLED; +} + +static irqreturn_t nvme_irq(int irq, void *data) +{ + irqreturn_t result; + struct nvme_queue *nvmeq = data; + spin_lock(&nvmeq->q_lock); + result = nvme_process_cq(nvmeq); + spin_unlock(&nvmeq->q_lock); + return result; +} + +static irqreturn_t nvme_irq_check(int irq, void *data) +{ + struct nvme_queue *nvmeq = data; + struct nvme_completion cqe = nvmeq->cqes[nvmeq->cq_head]; + if ((le16_to_cpu(cqe.status) & 1) != nvmeq->cq_phase) + return IRQ_NONE; + return IRQ_WAKE_THREAD; +} + +static void nvme_abort_command(struct nvme_queue *nvmeq, int cmdid) +{ + spin_lock_irq(&nvmeq->q_lock); + cancel_cmdid(nvmeq, cmdid, NULL); + spin_unlock_irq(&nvmeq->q_lock); +} + +struct sync_cmd_info { + struct task_struct *task; + u32 result; + int status; +}; + +static void sync_completion(struct nvme_dev *dev, void *ctx, + struct nvme_completion *cqe) +{ + struct sync_cmd_info *cmdinfo = ctx; + cmdinfo->result = le32_to_cpup(&cqe->result); + cmdinfo->status = le16_to_cpup(&cqe->status) >> 1; + wake_up_process(cmdinfo->task); +} + +/* + * Returns 0 on success. If the result is negative, it's a Linux error code; + * if the result is positive, it's an NVM Express status code + */ +int nvme_submit_sync_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd, + u32 *result, unsigned timeout) +{ + int cmdid; + struct sync_cmd_info cmdinfo; + + cmdinfo.task = current; + cmdinfo.status = -EINTR; + + cmdid = alloc_cmdid_killable(nvmeq, &cmdinfo, sync_completion, + timeout); + if (cmdid < 0) + return cmdid; + cmd->common.command_id = cmdid; + + set_current_state(TASK_KILLABLE); + nvme_submit_cmd(nvmeq, cmd); + schedule_timeout(timeout); + + if (cmdinfo.status == -EINTR) { + nvme_abort_command(nvmeq, cmdid); + return -EINTR; + } + + if (result) + *result = cmdinfo.result; + + return cmdinfo.status; +} + +int nvme_submit_admin_cmd(struct nvme_dev *dev, struct nvme_command *cmd, + u32 *result) +{ + return nvme_submit_sync_cmd(dev->queues[0], cmd, result, ADMIN_TIMEOUT); +} + +static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id) +{ + int status; + struct nvme_command c; + + memset(&c, 0, sizeof(c)); + c.delete_queue.opcode = opcode; + c.delete_queue.qid = cpu_to_le16(id); + + status = nvme_submit_admin_cmd(dev, &c, NULL); + if (status) + return -EIO; + return 0; +} + +static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid, + struct nvme_queue *nvmeq) +{ + int status; + struct nvme_command c; + int flags = NVME_QUEUE_PHYS_CONTIG | NVME_CQ_IRQ_ENABLED; + + memset(&c, 0, sizeof(c)); + c.create_cq.opcode = nvme_admin_create_cq; + c.create_cq.prp1 = cpu_to_le64(nvmeq->cq_dma_addr); + c.create_cq.cqid = cpu_to_le16(qid); + c.create_cq.qsize = cpu_to_le16(nvmeq->q_depth - 1); + c.create_cq.cq_flags = cpu_to_le16(flags); + c.create_cq.irq_vector = cpu_to_le16(nvmeq->cq_vector); + + status = nvme_submit_admin_cmd(dev, &c, NULL); + if (status) + return -EIO; + return 0; +} + +static int adapter_alloc_sq(struct nvme_dev *dev, u16 qid, + struct nvme_queue *nvmeq) +{ + int status; + struct nvme_command c; + int flags = NVME_QUEUE_PHYS_CONTIG | NVME_SQ_PRIO_MEDIUM; + + memset(&c, 0, sizeof(c)); + c.create_sq.opcode = nvme_admin_create_sq; + c.create_sq.prp1 = cpu_to_le64(nvmeq->sq_dma_addr); + c.create_sq.sqid = cpu_to_le16(qid); + c.create_sq.qsize = cpu_to_le16(nvmeq->q_depth - 1); + c.create_sq.sq_flags = cpu_to_le16(flags); + c.create_sq.cqid = cpu_to_le16(qid); + + status = nvme_submit_admin_cmd(dev, &c, NULL); + if (status) + return -EIO; + return 0; +} + +static int adapter_delete_cq(struct nvme_dev *dev, u16 cqid) +{ + return adapter_delete_queue(dev, nvme_admin_delete_cq, cqid); +} + +static int adapter_delete_sq(struct nvme_dev *dev, u16 sqid) +{ + return adapter_delete_queue(dev, nvme_admin_delete_sq, sqid); +} + +int nvme_identify(struct nvme_dev *dev, unsigned nsid, unsigned cns, + dma_addr_t dma_addr) +{ + struct nvme_command c; + + memset(&c, 0, sizeof(c)); + c.identify.opcode = nvme_admin_identify; + c.identify.nsid = cpu_to_le32(nsid); + c.identify.prp1 = cpu_to_le64(dma_addr); + c.identify.cns = cpu_to_le32(cns); + + return nvme_submit_admin_cmd(dev, &c, NULL); +} + +int nvme_get_features(struct nvme_dev *dev, unsigned fid, unsigned nsid, + dma_addr_t dma_addr, u32 *result) +{ + struct nvme_command c; + + memset(&c, 0, sizeof(c)); + c.features.opcode = nvme_admin_get_features; + c.features.nsid = cpu_to_le32(nsid); + c.features.prp1 = cpu_to_le64(dma_addr); + c.features.fid = cpu_to_le32(fid); + + return nvme_submit_admin_cmd(dev, &c, result); +} + +int nvme_set_features(struct nvme_dev *dev, unsigned fid, unsigned dword11, + dma_addr_t dma_addr, u32 *result) +{ + struct nvme_command c; + + memset(&c, 0, sizeof(c)); + c.features.opcode = nvme_admin_set_features; + c.features.prp1 = cpu_to_le64(dma_addr); + c.features.fid = cpu_to_le32(fid); + c.features.dword11 = cpu_to_le32(dword11); + + return nvme_submit_admin_cmd(dev, &c, result); +} + +/** + * nvme_cancel_ios - Cancel outstanding I/Os + * @queue: The queue to cancel I/Os on + * @timeout: True to only cancel I/Os which have timed out + */ +static void nvme_cancel_ios(struct nvme_queue *nvmeq, bool timeout) +{ + int depth = nvmeq->q_depth - 1; + struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); + unsigned long now = jiffies; + int cmdid; + + for_each_set_bit(cmdid, nvmeq->cmdid_data, depth) { + void *ctx; + nvme_completion_fn fn; + static struct nvme_completion cqe = { + .status = cpu_to_le16(NVME_SC_ABORT_REQ << 1), + }; + + if (timeout && !time_after(now, info[cmdid].timeout)) + continue; + dev_warn(nvmeq->q_dmadev, "Cancelling I/O %d\n", cmdid); + ctx = cancel_cmdid(nvmeq, cmdid, &fn); + fn(nvmeq->dev, ctx, &cqe); + } +} + +static void nvme_free_queue_mem(struct nvme_queue *nvmeq) +{ + dma_free_coherent(nvmeq->q_dmadev, CQ_SIZE(nvmeq->q_depth), + (void *)nvmeq->cqes, nvmeq->cq_dma_addr); + dma_free_coherent(nvmeq->q_dmadev, SQ_SIZE(nvmeq->q_depth), + nvmeq->sq_cmds, nvmeq->sq_dma_addr); + kfree(nvmeq); +} + +static void nvme_free_queue(struct nvme_dev *dev, int qid) +{ + struct nvme_queue *nvmeq = dev->queues[qid]; + int vector = dev->entry[nvmeq->cq_vector].vector; + + spin_lock_irq(&nvmeq->q_lock); + nvme_cancel_ios(nvmeq, false); + while (bio_list_peek(&nvmeq->sq_cong)) { + struct bio *bio = bio_list_pop(&nvmeq->sq_cong); + bio_endio(bio, -EIO); + } + spin_unlock_irq(&nvmeq->q_lock); + + irq_set_affinity_hint(vector, NULL); + free_irq(vector, nvmeq); + + /* Don't tell the adapter to delete the admin queue */ + if (qid) { + adapter_delete_sq(dev, qid); + adapter_delete_cq(dev, qid); + } + + nvme_free_queue_mem(nvmeq); +} + +static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid, + int depth, int vector) +{ + struct device *dmadev = &dev->pci_dev->dev; + unsigned extra = DIV_ROUND_UP(depth, 8) + (depth * + sizeof(struct nvme_cmd_info)); + struct nvme_queue *nvmeq = kzalloc(sizeof(*nvmeq) + extra, GFP_KERNEL); + if (!nvmeq) + return NULL; + + nvmeq->cqes = dma_alloc_coherent(dmadev, CQ_SIZE(depth), + &nvmeq->cq_dma_addr, GFP_KERNEL); + if (!nvmeq->cqes) + goto free_nvmeq; + memset((void *)nvmeq->cqes, 0, CQ_SIZE(depth)); + + nvmeq->sq_cmds = dma_alloc_coherent(dmadev, SQ_SIZE(depth), + &nvmeq->sq_dma_addr, GFP_KERNEL); + if (!nvmeq->sq_cmds) + goto free_cqdma; + + nvmeq->q_dmadev = dmadev; + nvmeq->dev = dev; + spin_lock_init(&nvmeq->q_lock); + nvmeq->cq_head = 0; + nvmeq->cq_phase = 1; + init_waitqueue_head(&nvmeq->sq_full); + init_waitqueue_entry(&nvmeq->sq_cong_wait, nvme_thread); + bio_list_init(&nvmeq->sq_cong); + nvmeq->q_db = &dev->dbs[qid << (dev->db_stride + 1)]; + nvmeq->q_depth = depth; + nvmeq->cq_vector = vector; + + return nvmeq; + + free_cqdma: + dma_free_coherent(dmadev, CQ_SIZE(depth), (void *)nvmeq->cqes, + nvmeq->cq_dma_addr); + free_nvmeq: + kfree(nvmeq); + return NULL; +} + +static int queue_request_irq(struct nvme_dev *dev, struct nvme_queue *nvmeq, + const char *name) +{ + if (use_threaded_interrupts) + return request_threaded_irq(dev->entry[nvmeq->cq_vector].vector, + nvme_irq_check, nvme_irq, + IRQF_DISABLED | IRQF_SHARED, + name, nvmeq); + return request_irq(dev->entry[nvmeq->cq_vector].vector, nvme_irq, + IRQF_DISABLED | IRQF_SHARED, name, nvmeq); +} + +static struct nvme_queue *nvme_create_queue(struct nvme_dev *dev, int qid, + int cq_size, int vector) +{ + int result; + struct nvme_queue *nvmeq = nvme_alloc_queue(dev, qid, cq_size, vector); + + if (!nvmeq) + return ERR_PTR(-ENOMEM); + + result = adapter_alloc_cq(dev, qid, nvmeq); + if (result < 0) + goto free_nvmeq; + + result = adapter_alloc_sq(dev, qid, nvmeq); + if (result < 0) + goto release_cq; + + result = queue_request_irq(dev, nvmeq, "nvme"); + if (result < 0) + goto release_sq; + + return nvmeq; + + release_sq: + adapter_delete_sq(dev, qid); + release_cq: + adapter_delete_cq(dev, qid); + free_nvmeq: + dma_free_coherent(nvmeq->q_dmadev, CQ_SIZE(nvmeq->q_depth), + (void *)nvmeq->cqes, nvmeq->cq_dma_addr); + dma_free_coherent(nvmeq->q_dmadev, SQ_SIZE(nvmeq->q_depth), + nvmeq->sq_cmds, nvmeq->sq_dma_addr); + kfree(nvmeq); + return ERR_PTR(result); +} + +static int nvme_wait_ready(struct nvme_dev *dev, u64 cap, bool enabled) +{ + unsigned long timeout; + u32 bit = enabled ? NVME_CSTS_RDY : 0; + + timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; + + while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) { + msleep(100); + if (fatal_signal_pending(current)) + return -EINTR; + if (time_after(jiffies, timeout)) { + dev_err(&dev->pci_dev->dev, + "Device not ready; aborting initialisation\n"); + return -ENODEV; + } + } + + return 0; +} + +/* + * If the device has been passed off to us in an enabled state, just clear + * the enabled bit. The spec says we should set the 'shutdown notification + * bits', but doing so may cause the device to complete commands to the + * admin queue ... and we don't know what memory that might be pointing at! + */ +static int nvme_disable_ctrl(struct nvme_dev *dev, u64 cap) +{ + u32 cc = readl(&dev->bar->cc); + + if (cc & NVME_CC_ENABLE) + writel(cc & ~NVME_CC_ENABLE, &dev->bar->cc); + return nvme_wait_ready(dev, cap, false); +} + +static int nvme_enable_ctrl(struct nvme_dev *dev, u64 cap) +{ + return nvme_wait_ready(dev, cap, true); +} + +static int nvme_configure_admin_queue(struct nvme_dev *dev) +{ + int result; + u32 aqa; + u64 cap = readq(&dev->bar->cap); + struct nvme_queue *nvmeq; + + dev->dbs = ((void __iomem *)dev->bar) + 4096; + dev->db_stride = NVME_CAP_STRIDE(cap); + + result = nvme_disable_ctrl(dev, cap); + if (result < 0) + return result; + + nvmeq = nvme_alloc_queue(dev, 0, 64, 0); + if (!nvmeq) + return -ENOMEM; + + aqa = nvmeq->q_depth - 1; + aqa |= aqa << 16; + + dev->ctrl_config = NVME_CC_ENABLE | NVME_CC_CSS_NVM; + dev->ctrl_config |= (PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT; + dev->ctrl_config |= NVME_CC_ARB_RR | NVME_CC_SHN_NONE; + dev->ctrl_config |= NVME_CC_IOSQES | NVME_CC_IOCQES; + + writel(aqa, &dev->bar->aqa); + writeq(nvmeq->sq_dma_addr, &dev->bar->asq); + writeq(nvmeq->cq_dma_addr, &dev->bar->acq); + writel(dev->ctrl_config, &dev->bar->cc); + + result = nvme_enable_ctrl(dev, cap); + if (result) + goto free_q; + + result = queue_request_irq(dev, nvmeq, "nvme admin"); + if (result) + goto free_q; + + dev->queues[0] = nvmeq; + return result; + + free_q: + nvme_free_queue_mem(nvmeq); + return result; +} + +struct nvme_iod *nvme_map_user_pages(struct nvme_dev *dev, int write, + unsigned long addr, unsigned length) +{ + int i, err, count, nents, offset; + struct scatterlist *sg; + struct page **pages; + struct nvme_iod *iod; + + if (addr & 3) + return ERR_PTR(-EINVAL); + if (!length) + return ERR_PTR(-EINVAL); + + offset = offset_in_page(addr); + count = DIV_ROUND_UP(offset + length, PAGE_SIZE); + pages = kcalloc(count, sizeof(*pages), GFP_KERNEL); + if (!pages) + return ERR_PTR(-ENOMEM); + + err = get_user_pages_fast(addr, count, 1, pages); + if (err < count) { + count = err; + err = -EFAULT; + goto put_pages; + } + + iod = nvme_alloc_iod(count, length, GFP_KERNEL); + sg = iod->sg; + sg_init_table(sg, count); + for (i = 0; i < count; i++) { + sg_set_page(&sg[i], pages[i], + min_t(int, length, PAGE_SIZE - offset), offset); + length -= (PAGE_SIZE - offset); + offset = 0; + } + sg_mark_end(&sg[i - 1]); + iod->nents = count; + + err = -ENOMEM; + nents = dma_map_sg(&dev->pci_dev->dev, sg, count, + write ? DMA_TO_DEVICE : DMA_FROM_DEVICE); + if (!nents) + goto free_iod; + + kfree(pages); + return iod; + + free_iod: + kfree(iod); + put_pages: + for (i = 0; i < count; i++) + put_page(pages[i]); + kfree(pages); + return ERR_PTR(err); +} + +void nvme_unmap_user_pages(struct nvme_dev *dev, int write, + struct nvme_iod *iod) +{ + int i; + + dma_unmap_sg(&dev->pci_dev->dev, iod->sg, iod->nents, + write ? DMA_TO_DEVICE : DMA_FROM_DEVICE); + + for (i = 0; i < iod->nents; i++) + put_page(sg_page(&iod->sg[i])); +} + +static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio) +{ + struct nvme_dev *dev = ns->dev; + struct nvme_queue *nvmeq; + struct nvme_user_io io; + struct nvme_command c; + unsigned length, meta_len; + int status, i; + struct nvme_iod *iod, *meta_iod = NULL; + dma_addr_t meta_dma_addr; + void *meta, *uninitialized_var(meta_mem); + + if (copy_from_user(&io, uio, sizeof(io))) + return -EFAULT; + length = (io.nblocks + 1) << ns->lba_shift; + meta_len = (io.nblocks + 1) * ns->ms; + + if (meta_len && ((io.metadata & 3) || !io.metadata)) + return -EINVAL; + + switch (io.opcode) { + case nvme_cmd_write: + case nvme_cmd_read: + case nvme_cmd_compare: + iod = nvme_map_user_pages(dev, io.opcode & 1, io.addr, length); + break; + default: + return -EINVAL; + } + + if (IS_ERR(iod)) + return PTR_ERR(iod); + + memset(&c, 0, sizeof(c)); + c.rw.opcode = io.opcode; + c.rw.flags = io.flags; + c.rw.nsid = cpu_to_le32(ns->ns_id); + c.rw.slba = cpu_to_le64(io.slba); + c.rw.length = cpu_to_le16(io.nblocks); + c.rw.control = cpu_to_le16(io.control); + c.rw.dsmgmt = cpu_to_le32(io.dsmgmt); + c.rw.reftag = cpu_to_le32(io.reftag); + c.rw.apptag = cpu_to_le16(io.apptag); + c.rw.appmask = cpu_to_le16(io.appmask); + + if (meta_len) { + meta_iod = nvme_map_user_pages(dev, io.opcode & 1, io.metadata, meta_len); + if (IS_ERR(meta_iod)) { + status = PTR_ERR(meta_iod); + meta_iod = NULL; + goto unmap; + } + + meta_mem = dma_alloc_coherent(&dev->pci_dev->dev, meta_len, + &meta_dma_addr, GFP_KERNEL); + if (!meta_mem) { + status = -ENOMEM; + goto unmap; + } + + if (io.opcode & 1) { + int meta_offset = 0; + + for (i = 0; i < meta_iod->nents; i++) { + meta = kmap_atomic(sg_page(&meta_iod->sg[i])) + + meta_iod->sg[i].offset; + memcpy(meta_mem + meta_offset, meta, + meta_iod->sg[i].length); + kunmap_atomic(meta); + meta_offset += meta_iod->sg[i].length; + } + } + + c.rw.metadata = cpu_to_le64(meta_dma_addr); + } + + length = nvme_setup_prps(dev, &c.common, iod, length, GFP_KERNEL); + + nvmeq = get_nvmeq(dev); + /* + * Since nvme_submit_sync_cmd sleeps, we can't keep preemption + * disabled. We may be preempted at any point, and be rescheduled + * to a different CPU. That will cause cacheline bouncing, but no + * additional races since q_lock already protects against other CPUs. + */ + put_nvmeq(nvmeq); + if (length != (io.nblocks + 1) << ns->lba_shift) + status = -ENOMEM; + else + status = nvme_submit_sync_cmd(nvmeq, &c, NULL, NVME_IO_TIMEOUT); + + if (meta_len) { + if (status == NVME_SC_SUCCESS && !(io.opcode & 1)) { + int meta_offset = 0; + + for (i = 0; i < meta_iod->nents; i++) { + meta = kmap_atomic(sg_page(&meta_iod->sg[i])) + + meta_iod->sg[i].offset; + memcpy(meta, meta_mem + meta_offset, + meta_iod->sg[i].length); + kunmap_atomic(meta); + meta_offset += meta_iod->sg[i].length; + } + } + + dma_free_coherent(&dev->pci_dev->dev, meta_len, meta_mem, + meta_dma_addr); + } + + unmap: + nvme_unmap_user_pages(dev, io.opcode & 1, iod); + nvme_free_iod(dev, iod); + + if (meta_iod) { + nvme_unmap_user_pages(dev, io.opcode & 1, meta_iod); + nvme_free_iod(dev, meta_iod); + } + + return status; +} + +static int nvme_user_admin_cmd(struct nvme_dev *dev, + struct nvme_admin_cmd __user *ucmd) +{ + struct nvme_admin_cmd cmd; + struct nvme_command c; + int status, length; + struct nvme_iod *uninitialized_var(iod); + unsigned timeout; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + if (copy_from_user(&cmd, ucmd, sizeof(cmd))) + return -EFAULT; + + memset(&c, 0, sizeof(c)); + c.common.opcode = cmd.opcode; + c.common.flags = cmd.flags; + c.common.nsid = cpu_to_le32(cmd.nsid); + c.common.cdw2[0] = cpu_to_le32(cmd.cdw2); + c.common.cdw2[1] = cpu_to_le32(cmd.cdw3); + c.common.cdw10[0] = cpu_to_le32(cmd.cdw10); + c.common.cdw10[1] = cpu_to_le32(cmd.cdw11); + c.common.cdw10[2] = cpu_to_le32(cmd.cdw12); + c.common.cdw10[3] = cpu_to_le32(cmd.cdw13); + c.common.cdw10[4] = cpu_to_le32(cmd.cdw14); + c.common.cdw10[5] = cpu_to_le32(cmd.cdw15); + + length = cmd.data_len; + if (cmd.data_len) { + iod = nvme_map_user_pages(dev, cmd.opcode & 1, cmd.addr, + length); + if (IS_ERR(iod)) + return PTR_ERR(iod); + length = nvme_setup_prps(dev, &c.common, iod, length, + GFP_KERNEL); + } + + timeout = cmd.timeout_ms ? msecs_to_jiffies(cmd.timeout_ms) : + ADMIN_TIMEOUT; + if (length != cmd.data_len) + status = -ENOMEM; + else + status = nvme_submit_sync_cmd(dev->queues[0], &c, &cmd.result, + timeout); + + if (cmd.data_len) { + nvme_unmap_user_pages(dev, cmd.opcode & 1, iod); + nvme_free_iod(dev, iod); + } + + if (!status && copy_to_user(&ucmd->result, &cmd.result, + sizeof(cmd.result))) + status = -EFAULT; + + return status; +} + +static int nvme_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd, + unsigned long arg) +{ + struct nvme_ns *ns = bdev->bd_disk->private_data; + + switch (cmd) { + case NVME_IOCTL_ID: + return ns->ns_id; + case NVME_IOCTL_ADMIN_CMD: + return nvme_user_admin_cmd(ns->dev, (void __user *)arg); + case NVME_IOCTL_SUBMIT_IO: + return nvme_submit_io(ns, (void __user *)arg); + case SG_GET_VERSION_NUM: + return nvme_sg_get_version_num((void __user *)arg); + case SG_IO: + return nvme_sg_io(ns, (void __user *)arg); + default: + return -ENOTTY; + } +} + +static const struct block_device_operations nvme_fops = { + .owner = THIS_MODULE, + .ioctl = nvme_ioctl, + .compat_ioctl = nvme_ioctl, +}; + +static void nvme_resubmit_bios(struct nvme_queue *nvmeq) +{ + while (bio_list_peek(&nvmeq->sq_cong)) { + struct bio *bio = bio_list_pop(&nvmeq->sq_cong); + struct nvme_ns *ns = bio->bi_bdev->bd_disk->private_data; + + if (bio_list_empty(&nvmeq->sq_cong)) + remove_wait_queue(&nvmeq->sq_full, + &nvmeq->sq_cong_wait); + if (nvme_submit_bio_queue(nvmeq, ns, bio)) { + if (bio_list_empty(&nvmeq->sq_cong)) + add_wait_queue(&nvmeq->sq_full, + &nvmeq->sq_cong_wait); + bio_list_add_head(&nvmeq->sq_cong, bio); + break; + } + } +} + +static int nvme_kthread(void *data) +{ + struct nvme_dev *dev; + + while (!kthread_should_stop()) { + set_current_state(TASK_INTERRUPTIBLE); + spin_lock(&dev_list_lock); + list_for_each_entry(dev, &dev_list, node) { + int i; + for (i = 0; i < dev->queue_count; i++) { + struct nvme_queue *nvmeq = dev->queues[i]; + if (!nvmeq) + continue; + spin_lock_irq(&nvmeq->q_lock); + if (nvme_process_cq(nvmeq)) + printk("process_cq did something\n"); + nvme_cancel_ios(nvmeq, true); + nvme_resubmit_bios(nvmeq); + spin_unlock_irq(&nvmeq->q_lock); + } + } + spin_unlock(&dev_list_lock); + schedule_timeout(round_jiffies_relative(HZ)); + } + return 0; +} + +static DEFINE_IDA(nvme_index_ida); + +static int nvme_get_ns_idx(void) +{ + int index, error; + + do { + if (!ida_pre_get(&nvme_index_ida, GFP_KERNEL)) + return -1; + + spin_lock(&dev_list_lock); + error = ida_get_new(&nvme_index_ida, &index); + spin_unlock(&dev_list_lock); + } while (error == -EAGAIN); + + if (error) + index = -1; + return index; +} + +static void nvme_put_ns_idx(int index) +{ + spin_lock(&dev_list_lock); + ida_remove(&nvme_index_ida, index); + spin_unlock(&dev_list_lock); +} + +static void nvme_config_discard(struct nvme_ns *ns) +{ + u32 logical_block_size = queue_logical_block_size(ns->queue); + ns->queue->limits.discard_zeroes_data = 0; + ns->queue->limits.discard_alignment = logical_block_size; + ns->queue->limits.discard_granularity = logical_block_size; + ns->queue->limits.max_discard_sectors = 0xffffffff; + queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, ns->queue); +} + +static struct nvme_ns *nvme_alloc_ns(struct nvme_dev *dev, int nsid, + struct nvme_id_ns *id, struct nvme_lba_range_type *rt) +{ + struct nvme_ns *ns; + struct gendisk *disk; + int lbaf; + + if (rt->attributes & NVME_LBART_ATTRIB_HIDE) + return NULL; + + ns = kzalloc(sizeof(*ns), GFP_KERNEL); + if (!ns) + return NULL; + ns->queue = blk_alloc_queue(GFP_KERNEL); + if (!ns->queue) + goto out_free_ns; + ns->queue->queue_flags = QUEUE_FLAG_DEFAULT; + queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, ns->queue); + queue_flag_set_unlocked(QUEUE_FLAG_NONROT, ns->queue); + blk_queue_make_request(ns->queue, nvme_make_request); + ns->dev = dev; + ns->queue->queuedata = ns; + + disk = alloc_disk(NVME_MINORS); + if (!disk) + goto out_free_queue; + ns->ns_id = nsid; + ns->disk = disk; + lbaf = id->flbas & 0xf; + ns->lba_shift = id->lbaf[lbaf].ds; + ns->ms = le16_to_cpu(id->lbaf[lbaf].ms); + blk_queue_logical_block_size(ns->queue, 1 << ns->lba_shift); + if (dev->max_hw_sectors) + blk_queue_max_hw_sectors(ns->queue, dev->max_hw_sectors); + + disk->major = nvme_major; + disk->minors = NVME_MINORS; + disk->first_minor = NVME_MINORS * nvme_get_ns_idx(); + disk->fops = &nvme_fops; + disk->private_data = ns; + disk->queue = ns->queue; + disk->driverfs_dev = &dev->pci_dev->dev; + sprintf(disk->disk_name, "nvme%dn%d", dev->instance, nsid); + set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9)); + + if (dev->oncs & NVME_CTRL_ONCS_DSM) + nvme_config_discard(ns); + + return ns; + + out_free_queue: + blk_cleanup_queue(ns->queue); + out_free_ns: + kfree(ns); + return NULL; +} + +static void nvme_ns_free(struct nvme_ns *ns) +{ + int index = ns->disk->first_minor / NVME_MINORS; + put_disk(ns->disk); + nvme_put_ns_idx(index); + blk_cleanup_queue(ns->queue); + kfree(ns); +} + +static int set_queue_count(struct nvme_dev *dev, int count) +{ + int status; + u32 result; + u32 q_count = (count - 1) | ((count - 1) << 16); + + status = nvme_set_features(dev, NVME_FEAT_NUM_QUEUES, q_count, 0, + &result); + if (status) + return -EIO; + return min(result & 0xffff, result >> 16) + 1; +} + +static int nvme_setup_io_queues(struct nvme_dev *dev) +{ + int result, cpu, i, nr_io_queues, db_bar_size, q_depth; + + nr_io_queues = num_online_cpus(); + result = set_queue_count(dev, nr_io_queues); + if (result < 0) + return result; + if (result < nr_io_queues) + nr_io_queues = result; + + /* Deregister the admin queue's interrupt */ + free_irq(dev->entry[0].vector, dev->queues[0]); + + db_bar_size = 4096 + ((nr_io_queues + 1) << (dev->db_stride + 3)); + if (db_bar_size > 8192) { + iounmap(dev->bar); + dev->bar = ioremap(pci_resource_start(dev->pci_dev, 0), + db_bar_size); + dev->dbs = ((void __iomem *)dev->bar) + 4096; + dev->queues[0]->q_db = dev->dbs; + } + + for (i = 0; i < nr_io_queues; i++) + dev->entry[i].entry = i; + for (;;) { + result = pci_enable_msix(dev->pci_dev, dev->entry, + nr_io_queues); + if (result == 0) { + break; + } else if (result > 0) { + nr_io_queues = result; + continue; + } else { + nr_io_queues = 1; + break; + } + } + + result = queue_request_irq(dev, dev->queues[0], "nvme admin"); + /* XXX: handle failure here */ + + cpu = cpumask_first(cpu_online_mask); + for (i = 0; i < nr_io_queues; i++) { + irq_set_affinity_hint(dev->entry[i].vector, get_cpu_mask(cpu)); + cpu = cpumask_next(cpu, cpu_online_mask); + } + + q_depth = min_t(int, NVME_CAP_MQES(readq(&dev->bar->cap)) + 1, + NVME_Q_DEPTH); + for (i = 0; i < nr_io_queues; i++) { + dev->queues[i + 1] = nvme_create_queue(dev, i + 1, q_depth, i); + if (IS_ERR(dev->queues[i + 1])) + return PTR_ERR(dev->queues[i + 1]); + dev->queue_count++; + } + + for (; i < num_possible_cpus(); i++) { + int target = i % rounddown_pow_of_two(dev->queue_count - 1); + dev->queues[i + 1] = dev->queues[target + 1]; + } + + return 0; +} + +static void nvme_free_queues(struct nvme_dev *dev) +{ + int i; + + for (i = dev->queue_count - 1; i >= 0; i--) + nvme_free_queue(dev, i); +} + +/* + * Return: error value if an error occurred setting up the queues or calling + * Identify Device. 0 if these succeeded, even if adding some of the + * namespaces failed. At the moment, these failures are silent. TBD which + * failures should be reported. + */ +static int nvme_dev_add(struct nvme_dev *dev) +{ + int res, nn, i; + struct nvme_ns *ns; + struct nvme_id_ctrl *ctrl; + struct nvme_id_ns *id_ns; + void *mem; + dma_addr_t dma_addr; + int shift = NVME_CAP_MPSMIN(readq(&dev->bar->cap)) + 12; + + res = nvme_setup_io_queues(dev); + if (res) + return res; + + mem = dma_alloc_coherent(&dev->pci_dev->dev, 8192, &dma_addr, + GFP_KERNEL); + if (!mem) + return -ENOMEM; + + res = nvme_identify(dev, 0, 1, dma_addr); + if (res) { + res = -EIO; + goto out; + } + + ctrl = mem; + nn = le32_to_cpup(&ctrl->nn); + dev->oncs = le16_to_cpup(&ctrl->oncs); + memcpy(dev->serial, ctrl->sn, sizeof(ctrl->sn)); + memcpy(dev->model, ctrl->mn, sizeof(ctrl->mn)); + memcpy(dev->firmware_rev, ctrl->fr, sizeof(ctrl->fr)); + if (ctrl->mdts) + dev->max_hw_sectors = 1 << (ctrl->mdts + shift - 9); + if ((dev->pci_dev->vendor == PCI_VENDOR_ID_INTEL) && + (dev->pci_dev->device == 0x0953) && ctrl->vs[3]) + dev->stripe_size = 1 << (ctrl->vs[3] + shift); + + id_ns = mem; + for (i = 1; i <= nn; i++) { + res = nvme_identify(dev, i, 0, dma_addr); + if (res) + continue; + + if (id_ns->ncap == 0) + continue; + + res = nvme_get_features(dev, NVME_FEAT_LBA_RANGE, i, + dma_addr + 4096, NULL); + if (res) + memset(mem + 4096, 0, 4096); + + ns = nvme_alloc_ns(dev, i, mem, mem + 4096); + if (ns) + list_add_tail(&ns->list, &dev->namespaces); + } + list_for_each_entry(ns, &dev->namespaces, list) + add_disk(ns->disk); + res = 0; + + out: + dma_free_coherent(&dev->pci_dev->dev, 8192, mem, dma_addr); + return res; +} + +static int nvme_dev_remove(struct nvme_dev *dev) +{ + struct nvme_ns *ns, *next; + + spin_lock(&dev_list_lock); + list_del(&dev->node); + spin_unlock(&dev_list_lock); + + list_for_each_entry_safe(ns, next, &dev->namespaces, list) { + list_del(&ns->list); + del_gendisk(ns->disk); + nvme_ns_free(ns); + } + + nvme_free_queues(dev); + + return 0; +} + +static int nvme_setup_prp_pools(struct nvme_dev *dev) +{ + struct device *dmadev = &dev->pci_dev->dev; + dev->prp_page_pool = dma_pool_create("prp list page", dmadev, + PAGE_SIZE, PAGE_SIZE, 0); + if (!dev->prp_page_pool) + return -ENOMEM; + + /* Optimisation for I/Os between 4k and 128k */ + dev->prp_small_pool = dma_pool_create("prp list 256", dmadev, + 256, 256, 0); + if (!dev->prp_small_pool) { + dma_pool_destroy(dev->prp_page_pool); + return -ENOMEM; + } + return 0; +} + +static void nvme_release_prp_pools(struct nvme_dev *dev) +{ + dma_pool_destroy(dev->prp_page_pool); + dma_pool_destroy(dev->prp_small_pool); +} + +static DEFINE_IDA(nvme_instance_ida); + +static int nvme_set_instance(struct nvme_dev *dev) +{ + int instance, error; + + do { + if (!ida_pre_get(&nvme_instance_ida, GFP_KERNEL)) + return -ENODEV; + + spin_lock(&dev_list_lock); + error = ida_get_new(&nvme_instance_ida, &instance); + spin_unlock(&dev_list_lock); + } while (error == -EAGAIN); + + if (error) + return -ENODEV; + + dev->instance = instance; + return 0; +} + +static void nvme_release_instance(struct nvme_dev *dev) +{ + spin_lock(&dev_list_lock); + ida_remove(&nvme_instance_ida, dev->instance); + spin_unlock(&dev_list_lock); +} + +static void nvme_free_dev(struct kref *kref) +{ + struct nvme_dev *dev = container_of(kref, struct nvme_dev, kref); + nvme_dev_remove(dev); + pci_disable_msix(dev->pci_dev); + iounmap(dev->bar); + nvme_release_instance(dev); + nvme_release_prp_pools(dev); + pci_disable_device(dev->pci_dev); + pci_release_regions(dev->pci_dev); + kfree(dev->queues); + kfree(dev->entry); + kfree(dev); +} + +static int nvme_dev_open(struct inode *inode, struct file *f) +{ + struct nvme_dev *dev = container_of(f->private_data, struct nvme_dev, + miscdev); + kref_get(&dev->kref); + f->private_data = dev; + return 0; +} + +static int nvme_dev_release(struct inode *inode, struct file *f) +{ + struct nvme_dev *dev = f->private_data; + kref_put(&dev->kref, nvme_free_dev); + return 0; +} + +static long nvme_dev_ioctl(struct file *f, unsigned int cmd, unsigned long arg) +{ + struct nvme_dev *dev = f->private_data; + switch (cmd) { + case NVME_IOCTL_ADMIN_CMD: + return nvme_user_admin_cmd(dev, (void __user *)arg); + default: + return -ENOTTY; + } +} + +static const struct file_operations nvme_dev_fops = { + .owner = THIS_MODULE, + .open = nvme_dev_open, + .release = nvme_dev_release, + .unlocked_ioctl = nvme_dev_ioctl, + .compat_ioctl = nvme_dev_ioctl, +}; + +static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + int bars, result = -ENOMEM; + struct nvme_dev *dev; + + dev = kzalloc(sizeof(*dev), GFP_KERNEL); + if (!dev) + return -ENOMEM; + dev->entry = kcalloc(num_possible_cpus(), sizeof(*dev->entry), + GFP_KERNEL); + if (!dev->entry) + goto free; + dev->queues = kcalloc(num_possible_cpus() + 1, sizeof(void *), + GFP_KERNEL); + if (!dev->queues) + goto free; + + if (pci_enable_device_mem(pdev)) + goto free; + pci_set_master(pdev); + bars = pci_select_bars(pdev, IORESOURCE_MEM); + if (pci_request_selected_regions(pdev, bars, "nvme")) + goto disable; + + INIT_LIST_HEAD(&dev->namespaces); + dev->pci_dev = pdev; + pci_set_drvdata(pdev, dev); + dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)); + dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64)); + result = nvme_set_instance(dev); + if (result) + goto disable; + + dev->entry[0].vector = pdev->irq; + + result = nvme_setup_prp_pools(dev); + if (result) + goto disable_msix; + + dev->bar = ioremap(pci_resource_start(pdev, 0), 8192); + if (!dev->bar) { + result = -ENOMEM; + goto disable_msix; + } + + result = nvme_configure_admin_queue(dev); + if (result) + goto unmap; + dev->queue_count++; + + spin_lock(&dev_list_lock); + list_add(&dev->node, &dev_list); + spin_unlock(&dev_list_lock); + + result = nvme_dev_add(dev); + if (result) + goto delete; + + scnprintf(dev->name, sizeof(dev->name), "nvme%d", dev->instance); + dev->miscdev.minor = MISC_DYNAMIC_MINOR; + dev->miscdev.parent = &pdev->dev; + dev->miscdev.name = dev->name; + dev->miscdev.fops = &nvme_dev_fops; + result = misc_register(&dev->miscdev); + if (result) + goto remove; + + kref_init(&dev->kref); + return 0; + + remove: + nvme_dev_remove(dev); + delete: + spin_lock(&dev_list_lock); + list_del(&dev->node); + spin_unlock(&dev_list_lock); + + nvme_free_queues(dev); + unmap: + iounmap(dev->bar); + disable_msix: + pci_disable_msix(pdev); + nvme_release_instance(dev); + nvme_release_prp_pools(dev); + disable: + pci_disable_device(pdev); + pci_release_regions(pdev); + free: + kfree(dev->queues); + kfree(dev->entry); + kfree(dev); + return result; +} + +static void nvme_remove(struct pci_dev *pdev) +{ + struct nvme_dev *dev = pci_get_drvdata(pdev); + misc_deregister(&dev->miscdev); + kref_put(&dev->kref, nvme_free_dev); +} + +/* These functions are yet to be implemented */ +#define nvme_error_detected NULL +#define nvme_dump_registers NULL +#define nvme_link_reset NULL +#define nvme_slot_reset NULL +#define nvme_error_resume NULL +#define nvme_suspend NULL +#define nvme_resume NULL + +static const struct pci_error_handlers nvme_err_handler = { + .error_detected = nvme_error_detected, + .mmio_enabled = nvme_dump_registers, + .link_reset = nvme_link_reset, + .slot_reset = nvme_slot_reset, + .resume = nvme_error_resume, +}; + +/* Move to pci_ids.h later */ +#define PCI_CLASS_STORAGE_EXPRESS 0x010802 + +static DEFINE_PCI_DEVICE_TABLE(nvme_id_table) = { + { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) }, + { 0, } +}; +MODULE_DEVICE_TABLE(pci, nvme_id_table); + +static struct pci_driver nvme_driver = { + .name = "nvme", + .id_table = nvme_id_table, + .probe = nvme_probe, + .remove = nvme_remove, + .suspend = nvme_suspend, + .resume = nvme_resume, + .err_handler = &nvme_err_handler, +}; + +static int __init nvme_init(void) +{ + int result; + + nvme_thread = kthread_run(nvme_kthread, NULL, "nvme"); + if (IS_ERR(nvme_thread)) + return PTR_ERR(nvme_thread); + + result = register_blkdev(nvme_major, "nvme"); + if (result < 0) + goto kill_kthread; + else if (result > 0) + nvme_major = result; + + result = pci_register_driver(&nvme_driver); + if (result) + goto unregister_blkdev; + return 0; + + unregister_blkdev: + unregister_blkdev(nvme_major, "nvme"); + kill_kthread: + kthread_stop(nvme_thread); + return result; +} + +static void __exit nvme_exit(void) +{ + pci_unregister_driver(&nvme_driver); + unregister_blkdev(nvme_major, "nvme"); + kthread_stop(nvme_thread); +} + +MODULE_AUTHOR("Matthew Wilcox "); +MODULE_LICENSE("GPL"); +MODULE_VERSION("0.8"); +module_init(nvme_init); +module_exit(nvme_exit); diff --git a/drivers/block/nvme-scsi.c b/drivers/block/nvme-scsi.c new file mode 100644 index 000000000000..fed54b039893 --- /dev/null +++ b/drivers/block/nvme-scsi.c @@ -0,0 +1,3053 @@ +/* + * NVM Express device driver + * Copyright (c) 2011, Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + */ + +/* + * Refer to the SCSI-NVMe Translation spec for details on how + * each command is translated. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +static int sg_version_num = 30534; /* 2 digits for each component */ + +#define SNTI_TRANSLATION_SUCCESS 0 +#define SNTI_INTERNAL_ERROR 1 + +/* VPD Page Codes */ +#define VPD_SUPPORTED_PAGES 0x00 +#define VPD_SERIAL_NUMBER 0x80 +#define VPD_DEVICE_IDENTIFIERS 0x83 +#define VPD_EXTENDED_INQUIRY 0x86 +#define VPD_BLOCK_DEV_CHARACTERISTICS 0xB1 + +/* CDB offsets */ +#define REPORT_LUNS_CDB_ALLOC_LENGTH_OFFSET 6 +#define REPORT_LUNS_SR_OFFSET 2 +#define READ_CAP_16_CDB_ALLOC_LENGTH_OFFSET 10 +#define REQUEST_SENSE_CDB_ALLOC_LENGTH_OFFSET 4 +#define REQUEST_SENSE_DESC_OFFSET 1 +#define REQUEST_SENSE_DESC_MASK 0x01 +#define DESCRIPTOR_FORMAT_SENSE_DATA_TYPE 1 +#define INQUIRY_EVPD_BYTE_OFFSET 1 +#define INQUIRY_PAGE_CODE_BYTE_OFFSET 2 +#define INQUIRY_EVPD_BIT_MASK 1 +#define INQUIRY_CDB_ALLOCATION_LENGTH_OFFSET 3 +#define START_STOP_UNIT_CDB_IMMED_OFFSET 1 +#define START_STOP_UNIT_CDB_IMMED_MASK 0x1 +#define START_STOP_UNIT_CDB_POWER_COND_MOD_OFFSET 3 +#define START_STOP_UNIT_CDB_POWER_COND_MOD_MASK 0xF +#define START_STOP_UNIT_CDB_POWER_COND_OFFSET 4 +#define START_STOP_UNIT_CDB_POWER_COND_MASK 0xF0 +#define START_STOP_UNIT_CDB_NO_FLUSH_OFFSET 4 +#define START_STOP_UNIT_CDB_NO_FLUSH_MASK 0x4 +#define START_STOP_UNIT_CDB_START_OFFSET 4 +#define START_STOP_UNIT_CDB_START_MASK 0x1 +#define WRITE_BUFFER_CDB_MODE_OFFSET 1 +#define WRITE_BUFFER_CDB_MODE_MASK 0x1F +#define WRITE_BUFFER_CDB_BUFFER_ID_OFFSET 2 +#define WRITE_BUFFER_CDB_BUFFER_OFFSET_OFFSET 3 +#define WRITE_BUFFER_CDB_PARM_LIST_LENGTH_OFFSET 6 +#define FORMAT_UNIT_CDB_FORMAT_PROT_INFO_OFFSET 1 +#define FORMAT_UNIT_CDB_FORMAT_PROT_INFO_MASK 0xC0 +#define FORMAT_UNIT_CDB_FORMAT_PROT_INFO_SHIFT 6 +#define FORMAT_UNIT_CDB_LONG_LIST_OFFSET 1 +#define FORMAT_UNIT_CDB_LONG_LIST_MASK 0x20 +#define FORMAT_UNIT_CDB_FORMAT_DATA_OFFSET 1 +#define FORMAT_UNIT_CDB_FORMAT_DATA_MASK 0x10 +#define FORMAT_UNIT_SHORT_PARM_LIST_LEN 4 +#define FORMAT_UNIT_LONG_PARM_LIST_LEN 8 +#define FORMAT_UNIT_PROT_INT_OFFSET 3 +#define FORMAT_UNIT_PROT_FIELD_USAGE_OFFSET 0 +#define FORMAT_UNIT_PROT_FIELD_USAGE_MASK 0x07 +#define UNMAP_CDB_PARAM_LIST_LENGTH_OFFSET 7 + +/* Misc. defines */ +#define NIBBLE_SHIFT 4 +#define FIXED_SENSE_DATA 0x70 +#define DESC_FORMAT_SENSE_DATA 0x72 +#define FIXED_SENSE_DATA_ADD_LENGTH 10 +#define LUN_ENTRY_SIZE 8 +#define LUN_DATA_HEADER_SIZE 8 +#define ALL_LUNS_RETURNED 0x02 +#define ALL_WELL_KNOWN_LUNS_RETURNED 0x01 +#define RESTRICTED_LUNS_RETURNED 0x00 +#define NVME_POWER_STATE_START_VALID 0x00 +#define NVME_POWER_STATE_ACTIVE 0x01 +#define NVME_POWER_STATE_IDLE 0x02 +#define NVME_POWER_STATE_STANDBY 0x03 +#define NVME_POWER_STATE_LU_CONTROL 0x07 +#define POWER_STATE_0 0 +#define POWER_STATE_1 1 +#define POWER_STATE_2 2 +#define POWER_STATE_3 3 +#define DOWNLOAD_SAVE_ACTIVATE 0x05 +#define DOWNLOAD_SAVE_DEFER_ACTIVATE 0x0E +#define ACTIVATE_DEFERRED_MICROCODE 0x0F +#define FORMAT_UNIT_IMMED_MASK 0x2 +#define FORMAT_UNIT_IMMED_OFFSET 1 +#define KELVIN_TEMP_FACTOR 273 +#define FIXED_FMT_SENSE_DATA_SIZE 18 +#define DESC_FMT_SENSE_DATA_SIZE 8 + +/* SCSI/NVMe defines and bit masks */ +#define INQ_STANDARD_INQUIRY_PAGE 0x00 +#define INQ_SUPPORTED_VPD_PAGES_PAGE 0x00 +#define INQ_UNIT_SERIAL_NUMBER_PAGE 0x80 +#define INQ_DEVICE_IDENTIFICATION_PAGE 0x83 +#define INQ_EXTENDED_INQUIRY_DATA_PAGE 0x86 +#define INQ_BDEV_CHARACTERISTICS_PAGE 0xB1 +#define INQ_SERIAL_NUMBER_LENGTH 0x14 +#define INQ_NUM_SUPPORTED_VPD_PAGES 5 +#define VERSION_SPC_4 0x06 +#define ACA_UNSUPPORTED 0 +#define STANDARD_INQUIRY_LENGTH 36 +#define ADDITIONAL_STD_INQ_LENGTH 31 +#define EXTENDED_INQUIRY_DATA_PAGE_LENGTH 0x3C +#define RESERVED_FIELD 0 + +/* SCSI READ/WRITE Defines */ +#define IO_CDB_WP_MASK 0xE0 +#define IO_CDB_WP_SHIFT 5 +#define IO_CDB_FUA_MASK 0x8 +#define IO_6_CDB_LBA_OFFSET 0 +#define IO_6_CDB_LBA_MASK 0x001FFFFF +#define IO_6_CDB_TX_LEN_OFFSET 4 +#define IO_6_DEFAULT_TX_LEN 256 +#define IO_10_CDB_LBA_OFFSET 2 +#define IO_10_CDB_TX_LEN_OFFSET 7 +#define IO_10_CDB_WP_OFFSET 1 +#define IO_10_CDB_FUA_OFFSET 1 +#define IO_12_CDB_LBA_OFFSET 2 +#define IO_12_CDB_TX_LEN_OFFSET 6 +#define IO_12_CDB_WP_OFFSET 1 +#define IO_12_CDB_FUA_OFFSET 1 +#define IO_16_CDB_FUA_OFFSET 1 +#define IO_16_CDB_WP_OFFSET 1 +#define IO_16_CDB_LBA_OFFSET 2 +#define IO_16_CDB_TX_LEN_OFFSET 10 + +/* Mode Sense/Select defines */ +#define MODE_PAGE_INFO_EXCEP 0x1C +#define MODE_PAGE_CACHING 0x08 +#define MODE_PAGE_CONTROL 0x0A +#define MODE_PAGE_POWER_CONDITION 0x1A +#define MODE_PAGE_RETURN_ALL 0x3F +#define MODE_PAGE_BLK_DES_LEN 0x08 +#define MODE_PAGE_LLBAA_BLK_DES_LEN 0x10 +#define MODE_PAGE_CACHING_LEN 0x14 +#define MODE_PAGE_CONTROL_LEN 0x0C +#define MODE_PAGE_POW_CND_LEN 0x28 +#define MODE_PAGE_INF_EXC_LEN 0x0C +#define MODE_PAGE_ALL_LEN 0x54 +#define MODE_SENSE6_MPH_SIZE 4 +#define MODE_SENSE6_ALLOC_LEN_OFFSET 4 +#define MODE_SENSE_PAGE_CONTROL_OFFSET 2 +#define MODE_SENSE_PAGE_CONTROL_MASK 0xC0 +#define MODE_SENSE_PAGE_CODE_OFFSET 2 +#define MODE_SENSE_PAGE_CODE_MASK 0x3F +#define MODE_SENSE_LLBAA_OFFSET 1 +#define MODE_SENSE_LLBAA_MASK 0x10 +#define MODE_SENSE_LLBAA_SHIFT 4 +#define MODE_SENSE_DBD_OFFSET 1 +#define MODE_SENSE_DBD_MASK 8 +#define MODE_SENSE_DBD_SHIFT 3 +#define MODE_SENSE10_MPH_SIZE 8 +#define MODE_SENSE10_ALLOC_LEN_OFFSET 7 +#define MODE_SELECT_CDB_PAGE_FORMAT_OFFSET 1 +#define MODE_SELECT_CDB_SAVE_PAGES_OFFSET 1 +#define MODE_SELECT_6_CDB_PARAM_LIST_LENGTH_OFFSET 4 +#define MODE_SELECT_10_CDB_PARAM_LIST_LENGTH_OFFSET 7 +#define MODE_SELECT_CDB_PAGE_FORMAT_MASK 0x10 +#define MODE_SELECT_CDB_SAVE_PAGES_MASK 0x1 +#define MODE_SELECT_6_BD_OFFSET 3 +#define MODE_SELECT_10_BD_OFFSET 6 +#define MODE_SELECT_10_LLBAA_OFFSET 4 +#define MODE_SELECT_10_LLBAA_MASK 1 +#define MODE_SELECT_6_MPH_SIZE 4 +#define MODE_SELECT_10_MPH_SIZE 8 +#define CACHING_MODE_PAGE_WCE_MASK 0x04 +#define MODE_SENSE_BLK_DESC_ENABLED 0 +#define MODE_SENSE_BLK_DESC_COUNT 1 +#define MODE_SELECT_PAGE_CODE_MASK 0x3F +#define SHORT_DESC_BLOCK 8 +#define LONG_DESC_BLOCK 16 +#define MODE_PAGE_POW_CND_LEN_FIELD 0x26 +#define MODE_PAGE_INF_EXC_LEN_FIELD 0x0A +#define MODE_PAGE_CACHING_LEN_FIELD 0x12 +#define MODE_PAGE_CONTROL_LEN_FIELD 0x0A +#define MODE_SENSE_PC_CURRENT_VALUES 0 + +/* Log Sense defines */ +#define LOG_PAGE_SUPPORTED_LOG_PAGES_PAGE 0x00 +#define LOG_PAGE_SUPPORTED_LOG_PAGES_LENGTH 0x07 +#define LOG_PAGE_INFORMATIONAL_EXCEPTIONS_PAGE 0x2F +#define LOG_PAGE_TEMPERATURE_PAGE 0x0D +#define LOG_SENSE_CDB_SP_OFFSET 1 +#define LOG_SENSE_CDB_SP_NOT_ENABLED 0 +#define LOG_SENSE_CDB_PC_OFFSET 2 +#define LOG_SENSE_CDB_PC_MASK 0xC0 +#define LOG_SENSE_CDB_PC_SHIFT 6 +#define LOG_SENSE_CDB_PC_CUMULATIVE_VALUES 1 +#define LOG_SENSE_CDB_PAGE_CODE_MASK 0x3F +#define LOG_SENSE_CDB_ALLOC_LENGTH_OFFSET 7 +#define REMAINING_INFO_EXCP_PAGE_LENGTH 0x8 +#define LOG_INFO_EXCP_PAGE_LENGTH 0xC +#define REMAINING_TEMP_PAGE_LENGTH 0xC +#define LOG_TEMP_PAGE_LENGTH 0x10 +#define LOG_TEMP_UNKNOWN 0xFF +#define SUPPORTED_LOG_PAGES_PAGE_LENGTH 0x3 + +/* Read Capacity defines */ +#define READ_CAP_10_RESP_SIZE 8 +#define READ_CAP_16_RESP_SIZE 32 + +/* NVMe Namespace and Command Defines */ +#define NVME_GET_SMART_LOG_PAGE 0x02 +#define NVME_GET_FEAT_TEMP_THRESH 0x04 +#define BYTES_TO_DWORDS 4 +#define NVME_MAX_FIRMWARE_SLOT 7 + +/* Report LUNs defines */ +#define REPORT_LUNS_FIRST_LUN_OFFSET 8 + +/* SCSI ADDITIONAL SENSE Codes */ + +#define SCSI_ASC_NO_SENSE 0x00 +#define SCSI_ASC_PERIPHERAL_DEV_WRITE_FAULT 0x03 +#define SCSI_ASC_LUN_NOT_READY 0x04 +#define SCSI_ASC_WARNING 0x0B +#define SCSI_ASC_LOG_BLOCK_GUARD_CHECK_FAILED 0x10 +#define SCSI_ASC_LOG_BLOCK_APPTAG_CHECK_FAILED 0x10 +#define SCSI_ASC_LOG_BLOCK_REFTAG_CHECK_FAILED 0x10 +#define SCSI_ASC_UNRECOVERED_READ_ERROR 0x11 +#define SCSI_ASC_MISCOMPARE_DURING_VERIFY 0x1D +#define SCSI_ASC_ACCESS_DENIED_INVALID_LUN_ID 0x20 +#define SCSI_ASC_ILLEGAL_COMMAND 0x20 +#define SCSI_ASC_ILLEGAL_BLOCK 0x21 +#define SCSI_ASC_INVALID_CDB 0x24 +#define SCSI_ASC_INVALID_LUN 0x25 +#define SCSI_ASC_INVALID_PARAMETER 0x26 +#define SCSI_ASC_FORMAT_COMMAND_FAILED 0x31 +#define SCSI_ASC_INTERNAL_TARGET_FAILURE 0x44 + +/* SCSI ADDITIONAL SENSE Code Qualifiers */ + +#define SCSI_ASCQ_CAUSE_NOT_REPORTABLE 0x00 +#define SCSI_ASCQ_FORMAT_COMMAND_FAILED 0x01 +#define SCSI_ASCQ_LOG_BLOCK_GUARD_CHECK_FAILED 0x01 +#define SCSI_ASCQ_LOG_BLOCK_APPTAG_CHECK_FAILED 0x02 +#define SCSI_ASCQ_LOG_BLOCK_REFTAG_CHECK_FAILED 0x03 +#define SCSI_ASCQ_FORMAT_IN_PROGRESS 0x04 +#define SCSI_ASCQ_POWER_LOSS_EXPECTED 0x08 +#define SCSI_ASCQ_INVALID_LUN_ID 0x09 + +/** + * DEVICE_SPECIFIC_PARAMETER in mode parameter header (see sbc2r16) to + * enable DPOFUA support type 0x10 value. + */ +#define DEVICE_SPECIFIC_PARAMETER 0 +#define VPD_ID_DESCRIPTOR_LENGTH sizeof(VPD_IDENTIFICATION_DESCRIPTOR) + +/* MACROs to extract information from CDBs */ + +#define GET_OPCODE(cdb) cdb[0] + +#define GET_U8_FROM_CDB(cdb, index) (cdb[index] << 0) + +#define GET_U16_FROM_CDB(cdb, index) ((cdb[index] << 8) | (cdb[index + 1] << 0)) + +#define GET_U24_FROM_CDB(cdb, index) ((cdb[index] << 16) | \ +(cdb[index + 1] << 8) | \ +(cdb[index + 2] << 0)) + +#define GET_U32_FROM_CDB(cdb, index) ((cdb[index] << 24) | \ +(cdb[index + 1] << 16) | \ +(cdb[index + 2] << 8) | \ +(cdb[index + 3] << 0)) + +#define GET_U64_FROM_CDB(cdb, index) ((((u64)cdb[index]) << 56) | \ +(((u64)cdb[index + 1]) << 48) | \ +(((u64)cdb[index + 2]) << 40) | \ +(((u64)cdb[index + 3]) << 32) | \ +(((u64)cdb[index + 4]) << 24) | \ +(((u64)cdb[index + 5]) << 16) | \ +(((u64)cdb[index + 6]) << 8) | \ +(((u64)cdb[index + 7]) << 0)) + +/* Inquiry Helper Macros */ +#define GET_INQ_EVPD_BIT(cdb) \ +((GET_U8_FROM_CDB(cdb, INQUIRY_EVPD_BYTE_OFFSET) & \ +INQUIRY_EVPD_BIT_MASK) ? 1 : 0) + +#define GET_INQ_PAGE_CODE(cdb) \ +(GET_U8_FROM_CDB(cdb, INQUIRY_PAGE_CODE_BYTE_OFFSET)) + +#define GET_INQ_ALLOC_LENGTH(cdb) \ +(GET_U16_FROM_CDB(cdb, INQUIRY_CDB_ALLOCATION_LENGTH_OFFSET)) + +/* Report LUNs Helper Macros */ +#define GET_REPORT_LUNS_ALLOC_LENGTH(cdb) \ +(GET_U32_FROM_CDB(cdb, REPORT_LUNS_CDB_ALLOC_LENGTH_OFFSET)) + +/* Read Capacity Helper Macros */ +#define GET_READ_CAP_16_ALLOC_LENGTH(cdb) \ +(GET_U32_FROM_CDB(cdb, READ_CAP_16_CDB_ALLOC_LENGTH_OFFSET)) + +#define IS_READ_CAP_16(cdb) \ +((cdb[0] == SERVICE_ACTION_IN && cdb[1] == SAI_READ_CAPACITY_16) ? 1 : 0) + +/* Request Sense Helper Macros */ +#define GET_REQUEST_SENSE_ALLOC_LENGTH(cdb) \ +(GET_U8_FROM_CDB(cdb, REQUEST_SENSE_CDB_ALLOC_LENGTH_OFFSET)) + +/* Mode Sense Helper Macros */ +#define GET_MODE_SENSE_DBD(cdb) \ +((GET_U8_FROM_CDB(cdb, MODE_SENSE_DBD_OFFSET) & MODE_SENSE_DBD_MASK) >> \ +MODE_SENSE_DBD_SHIFT) + +#define GET_MODE_SENSE_LLBAA(cdb) \ +((GET_U8_FROM_CDB(cdb, MODE_SENSE_LLBAA_OFFSET) & \ +MODE_SENSE_LLBAA_MASK) >> MODE_SENSE_LLBAA_SHIFT) + +#define GET_MODE_SENSE_MPH_SIZE(cdb10) \ +(cdb10 ? MODE_SENSE10_MPH_SIZE : MODE_SENSE6_MPH_SIZE) + + +/* Struct to gather data that needs to be extracted from a SCSI CDB. + Not conforming to any particular CDB variant, but compatible with all. */ + +struct nvme_trans_io_cdb { + u8 fua; + u8 prot_info; + u64 lba; + u32 xfer_len; +}; + + +/* Internal Helper Functions */ + + +/* Copy data to userspace memory */ + +static int nvme_trans_copy_to_user(struct sg_io_hdr *hdr, void *from, + unsigned long n) +{ + int res = SNTI_TRANSLATION_SUCCESS; + unsigned long not_copied; + int i; + void *index = from; + size_t remaining = n; + size_t xfer_len; + + if (hdr->iovec_count > 0) { + struct sg_iovec sgl; + + for (i = 0; i < hdr->iovec_count; i++) { + not_copied = copy_from_user(&sgl, hdr->dxferp + + i * sizeof(struct sg_iovec), + sizeof(struct sg_iovec)); + if (not_copied) + return -EFAULT; + xfer_len = min(remaining, sgl.iov_len); + not_copied = copy_to_user(sgl.iov_base, index, + xfer_len); + if (not_copied) { + res = -EFAULT; + break; + } + index += xfer_len; + remaining -= xfer_len; + if (remaining == 0) + break; + } + return res; + } + not_copied = copy_to_user(hdr->dxferp, from, n); + if (not_copied) + res = -EFAULT; + return res; +} + +/* Copy data from userspace memory */ + +static int nvme_trans_copy_from_user(struct sg_io_hdr *hdr, void *to, + unsigned long n) +{ + int res = SNTI_TRANSLATION_SUCCESS; + unsigned long not_copied; + int i; + void *index = to; + size_t remaining = n; + size_t xfer_len; + + if (hdr->iovec_count > 0) { + struct sg_iovec sgl; + + for (i = 0; i < hdr->iovec_count; i++) { + not_copied = copy_from_user(&sgl, hdr->dxferp + + i * sizeof(struct sg_iovec), + sizeof(struct sg_iovec)); + if (not_copied) + return -EFAULT; + xfer_len = min(remaining, sgl.iov_len); + not_copied = copy_from_user(index, sgl.iov_base, + xfer_len); + if (not_copied) { + res = -EFAULT; + break; + } + index += xfer_len; + remaining -= xfer_len; + if (remaining == 0) + break; + } + return res; + } + + not_copied = copy_from_user(to, hdr->dxferp, n); + if (not_copied) + res = -EFAULT; + return res; +} + +/* Status/Sense Buffer Writeback */ + +static int nvme_trans_completion(struct sg_io_hdr *hdr, u8 status, u8 sense_key, + u8 asc, u8 ascq) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 xfer_len; + u8 resp[DESC_FMT_SENSE_DATA_SIZE]; + + if (scsi_status_is_good(status)) { + hdr->status = SAM_STAT_GOOD; + hdr->masked_status = GOOD; + hdr->host_status = DID_OK; + hdr->driver_status = DRIVER_OK; + hdr->sb_len_wr = 0; + } else { + hdr->status = status; + hdr->masked_status = status >> 1; + hdr->host_status = DID_OK; + hdr->driver_status = DRIVER_OK; + + memset(resp, 0, DESC_FMT_SENSE_DATA_SIZE); + resp[0] = DESC_FORMAT_SENSE_DATA; + resp[1] = sense_key; + resp[2] = asc; + resp[3] = ascq; + + xfer_len = min_t(u8, hdr->mx_sb_len, DESC_FMT_SENSE_DATA_SIZE); + hdr->sb_len_wr = xfer_len; + if (copy_to_user(hdr->sbp, resp, xfer_len) > 0) + res = -EFAULT; + } + + return res; +} + +static int nvme_trans_status_code(struct sg_io_hdr *hdr, int nvme_sc) +{ + u8 status, sense_key, asc, ascq; + int res = SNTI_TRANSLATION_SUCCESS; + + /* For non-nvme (Linux) errors, simply return the error code */ + if (nvme_sc < 0) + return nvme_sc; + + /* Mask DNR, More, and reserved fields */ + nvme_sc &= 0x7FF; + + switch (nvme_sc) { + /* Generic Command Status */ + case NVME_SC_SUCCESS: + status = SAM_STAT_GOOD; + sense_key = NO_SENSE; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_INVALID_OPCODE: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_ILLEGAL_COMMAND; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_INVALID_FIELD: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_INVALID_CDB; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_DATA_XFER_ERROR: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_POWER_LOSS: + status = SAM_STAT_TASK_ABORTED; + sense_key = ABORTED_COMMAND; + asc = SCSI_ASC_WARNING; + ascq = SCSI_ASCQ_POWER_LOSS_EXPECTED; + break; + case NVME_SC_INTERNAL: + status = SAM_STAT_CHECK_CONDITION; + sense_key = HARDWARE_ERROR; + asc = SCSI_ASC_INTERNAL_TARGET_FAILURE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_ABORT_REQ: + status = SAM_STAT_TASK_ABORTED; + sense_key = ABORTED_COMMAND; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_ABORT_QUEUE: + status = SAM_STAT_TASK_ABORTED; + sense_key = ABORTED_COMMAND; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_FUSED_FAIL: + status = SAM_STAT_TASK_ABORTED; + sense_key = ABORTED_COMMAND; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_FUSED_MISSING: + status = SAM_STAT_TASK_ABORTED; + sense_key = ABORTED_COMMAND; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_INVALID_NS: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_ACCESS_DENIED_INVALID_LUN_ID; + ascq = SCSI_ASCQ_INVALID_LUN_ID; + break; + case NVME_SC_LBA_RANGE: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_ILLEGAL_BLOCK; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_CAP_EXCEEDED: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_NS_NOT_READY: + status = SAM_STAT_CHECK_CONDITION; + sense_key = NOT_READY; + asc = SCSI_ASC_LUN_NOT_READY; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + + /* Command Specific Status */ + case NVME_SC_INVALID_FORMAT: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_FORMAT_COMMAND_FAILED; + ascq = SCSI_ASCQ_FORMAT_COMMAND_FAILED; + break; + case NVME_SC_BAD_ATTRIBUTES: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_INVALID_CDB; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + + /* Media Errors */ + case NVME_SC_WRITE_FAULT: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_PERIPHERAL_DEV_WRITE_FAULT; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_READ_ERROR: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_UNRECOVERED_READ_ERROR; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_GUARD_CHECK: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_LOG_BLOCK_GUARD_CHECK_FAILED; + ascq = SCSI_ASCQ_LOG_BLOCK_GUARD_CHECK_FAILED; + break; + case NVME_SC_APPTAG_CHECK: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_LOG_BLOCK_APPTAG_CHECK_FAILED; + ascq = SCSI_ASCQ_LOG_BLOCK_APPTAG_CHECK_FAILED; + break; + case NVME_SC_REFTAG_CHECK: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MEDIUM_ERROR; + asc = SCSI_ASC_LOG_BLOCK_REFTAG_CHECK_FAILED; + ascq = SCSI_ASCQ_LOG_BLOCK_REFTAG_CHECK_FAILED; + break; + case NVME_SC_COMPARE_FAILED: + status = SAM_STAT_CHECK_CONDITION; + sense_key = MISCOMPARE; + asc = SCSI_ASC_MISCOMPARE_DURING_VERIFY; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + case NVME_SC_ACCESS_DENIED: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_ACCESS_DENIED_INVALID_LUN_ID; + ascq = SCSI_ASCQ_INVALID_LUN_ID; + break; + + /* Unspecified/Default */ + case NVME_SC_CMDID_CONFLICT: + case NVME_SC_CMD_SEQ_ERROR: + case NVME_SC_CQ_INVALID: + case NVME_SC_QID_INVALID: + case NVME_SC_QUEUE_SIZE: + case NVME_SC_ABORT_LIMIT: + case NVME_SC_ABORT_MISSING: + case NVME_SC_ASYNC_LIMIT: + case NVME_SC_FIRMWARE_SLOT: + case NVME_SC_FIRMWARE_IMAGE: + case NVME_SC_INVALID_VECTOR: + case NVME_SC_INVALID_LOG_PAGE: + default: + status = SAM_STAT_CHECK_CONDITION; + sense_key = ILLEGAL_REQUEST; + asc = SCSI_ASC_NO_SENSE; + ascq = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + break; + } + + res = nvme_trans_completion(hdr, status, sense_key, asc, ascq); + + return res; +} + +/* INQUIRY Helper Functions */ + +static int nvme_trans_standard_inquiry_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *inq_response, + int alloc_len) +{ + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ns *id_ns; + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + int xfer_len; + u8 resp_data_format = 0x02; + u8 protect; + u8 cmdque = 0x01 << 1; + + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out_dma; + } + + /* nvme ns identify - use DPS value for PROTECT field */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + /* + * If nvme_sc was -ve, res will be -ve here. + * If nvme_sc was +ve, the status would bace been translated, and res + * can only be 0 or -ve. + * - If 0 && nvme_sc > 0, then go into next if where res gets nvme_sc + * - If -ve, return because its a Linux error. + */ + if (res) + goto out_free; + if (nvme_sc) { + res = nvme_sc; + goto out_free; + } + id_ns = mem; + (id_ns->dps) ? (protect = 0x01) : (protect = 0); + + memset(inq_response, 0, STANDARD_INQUIRY_LENGTH); + inq_response[2] = VERSION_SPC_4; + inq_response[3] = resp_data_format; /*normaca=0 | hisup=0 */ + inq_response[4] = ADDITIONAL_STD_INQ_LENGTH; + inq_response[5] = protect; /* sccs=0 | acc=0 | tpgs=0 | pc3=0 */ + inq_response[7] = cmdque; /* wbus16=0 | sync=0 | vs=0 */ + strncpy(&inq_response[8], "NVMe ", 8); + strncpy(&inq_response[16], dev->model, 16); + strncpy(&inq_response[32], dev->firmware_rev, 4); + + xfer_len = min(alloc_len, STANDARD_INQUIRY_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + out_free: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out_dma: + return res; +} + +static int nvme_trans_supported_vpd_pages(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *inq_response, + int alloc_len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + + memset(inq_response, 0, STANDARD_INQUIRY_LENGTH); + inq_response[1] = INQ_SUPPORTED_VPD_PAGES_PAGE; /* Page Code */ + inq_response[3] = INQ_NUM_SUPPORTED_VPD_PAGES; /* Page Length */ + inq_response[4] = INQ_SUPPORTED_VPD_PAGES_PAGE; + inq_response[5] = INQ_UNIT_SERIAL_NUMBER_PAGE; + inq_response[6] = INQ_DEVICE_IDENTIFICATION_PAGE; + inq_response[7] = INQ_EXTENDED_INQUIRY_DATA_PAGE; + inq_response[8] = INQ_BDEV_CHARACTERISTICS_PAGE; + + xfer_len = min(alloc_len, STANDARD_INQUIRY_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + return res; +} + +static int nvme_trans_unit_serial_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *inq_response, + int alloc_len) +{ + struct nvme_dev *dev = ns->dev; + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + + memset(inq_response, 0, STANDARD_INQUIRY_LENGTH); + inq_response[1] = INQ_UNIT_SERIAL_NUMBER_PAGE; /* Page Code */ + inq_response[3] = INQ_SERIAL_NUMBER_LENGTH; /* Page Length */ + strncpy(&inq_response[4], dev->serial, INQ_SERIAL_NUMBER_LENGTH); + + xfer_len = min(alloc_len, STANDARD_INQUIRY_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + return res; +} + +static int nvme_trans_device_id_page(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *inq_response, int alloc_len) +{ + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ctrl *id_ctrl; + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + u8 ieee[4]; + int xfer_len; + __be32 tmp_id = cpu_to_be32(ns->ns_id); + + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out_dma; + } + + /* nvme controller identify */ + nvme_sc = nvme_identify(dev, 0, 1, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_free; + if (nvme_sc) { + res = nvme_sc; + goto out_free; + } + id_ctrl = mem; + + /* Since SCSI tried to save 4 bits... [SPC-4(r34) Table 591] */ + ieee[0] = id_ctrl->ieee[0] << 4; + ieee[1] = id_ctrl->ieee[0] >> 4 | id_ctrl->ieee[1] << 4; + ieee[2] = id_ctrl->ieee[1] >> 4 | id_ctrl->ieee[2] << 4; + ieee[3] = id_ctrl->ieee[2] >> 4; + + memset(inq_response, 0, STANDARD_INQUIRY_LENGTH); + inq_response[1] = INQ_DEVICE_IDENTIFICATION_PAGE; /* Page Code */ + inq_response[3] = 20; /* Page Length */ + /* Designation Descriptor start */ + inq_response[4] = 0x01; /* Proto ID=0h | Code set=1h */ + inq_response[5] = 0x03; /* PIV=0b | Asso=00b | Designator Type=3h */ + inq_response[6] = 0x00; /* Rsvd */ + inq_response[7] = 16; /* Designator Length */ + /* Designator start */ + inq_response[8] = 0x60 | ieee[3]; /* NAA=6h | IEEE ID MSB, High nibble*/ + inq_response[9] = ieee[2]; /* IEEE ID */ + inq_response[10] = ieee[1]; /* IEEE ID */ + inq_response[11] = ieee[0]; /* IEEE ID| Vendor Specific ID... */ + inq_response[12] = (dev->pci_dev->vendor & 0xFF00) >> 8; + inq_response[13] = (dev->pci_dev->vendor & 0x00FF); + inq_response[14] = dev->serial[0]; + inq_response[15] = dev->serial[1]; + inq_response[16] = dev->model[0]; + inq_response[17] = dev->model[1]; + memcpy(&inq_response[18], &tmp_id, sizeof(u32)); + /* Last 2 bytes are zero */ + + xfer_len = min(alloc_len, STANDARD_INQUIRY_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + out_free: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out_dma: + return res; +} + +static int nvme_trans_ext_inq_page(struct nvme_ns *ns, struct sg_io_hdr *hdr, + int alloc_len) +{ + u8 *inq_response; + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ctrl *id_ctrl; + struct nvme_id_ns *id_ns; + int xfer_len; + u8 microcode = 0x80; + u8 spt; + u8 spt_lut[8] = {0, 0, 2, 1, 4, 6, 5, 7}; + u8 grd_chk, app_chk, ref_chk, protect; + u8 uask_sup = 0x20; + u8 v_sup; + u8 luiclr = 0x01; + + inq_response = kmalloc(EXTENDED_INQUIRY_DATA_PAGE_LENGTH, GFP_KERNEL); + if (inq_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out_dma; + } + + /* nvme ns identify */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_free; + if (nvme_sc) { + res = nvme_sc; + goto out_free; + } + id_ns = mem; + spt = spt_lut[(id_ns->dpc) & 0x07] << 3; + (id_ns->dps) ? (protect = 0x01) : (protect = 0); + grd_chk = protect << 2; + app_chk = protect << 1; + ref_chk = protect; + + /* nvme controller identify */ + nvme_sc = nvme_identify(dev, 0, 1, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_free; + if (nvme_sc) { + res = nvme_sc; + goto out_free; + } + id_ctrl = mem; + v_sup = id_ctrl->vwc; + + memset(inq_response, 0, EXTENDED_INQUIRY_DATA_PAGE_LENGTH); + inq_response[1] = INQ_EXTENDED_INQUIRY_DATA_PAGE; /* Page Code */ + inq_response[2] = 0x00; /* Page Length MSB */ + inq_response[3] = 0x3C; /* Page Length LSB */ + inq_response[4] = microcode | spt | grd_chk | app_chk | ref_chk; + inq_response[5] = uask_sup; + inq_response[6] = v_sup; + inq_response[7] = luiclr; + inq_response[8] = 0; + inq_response[9] = 0; + + xfer_len = min(alloc_len, EXTENDED_INQUIRY_DATA_PAGE_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + out_free: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out_dma: + kfree(inq_response); + out_mem: + return res; +} + +static int nvme_trans_bdev_char_page(struct nvme_ns *ns, struct sg_io_hdr *hdr, + int alloc_len) +{ + u8 *inq_response; + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + + inq_response = kmalloc(EXTENDED_INQUIRY_DATA_PAGE_LENGTH, GFP_KERNEL); + if (inq_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + + memset(inq_response, 0, EXTENDED_INQUIRY_DATA_PAGE_LENGTH); + inq_response[1] = INQ_BDEV_CHARACTERISTICS_PAGE; /* Page Code */ + inq_response[2] = 0x00; /* Page Length MSB */ + inq_response[3] = 0x3C; /* Page Length LSB */ + inq_response[4] = 0x00; /* Medium Rotation Rate MSB */ + inq_response[5] = 0x01; /* Medium Rotation Rate LSB */ + inq_response[6] = 0x00; /* Form Factor */ + + xfer_len = min(alloc_len, EXTENDED_INQUIRY_DATA_PAGE_LENGTH); + res = nvme_trans_copy_to_user(hdr, inq_response, xfer_len); + + kfree(inq_response); + out_mem: + return res; +} + +/* LOG SENSE Helper Functions */ + +static int nvme_trans_log_supp_pages(struct nvme_ns *ns, struct sg_io_hdr *hdr, + int alloc_len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + u8 *log_response; + + log_response = kmalloc(LOG_PAGE_SUPPORTED_LOG_PAGES_LENGTH, GFP_KERNEL); + if (log_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + memset(log_response, 0, LOG_PAGE_SUPPORTED_LOG_PAGES_LENGTH); + + log_response[0] = LOG_PAGE_SUPPORTED_LOG_PAGES_PAGE; + /* Subpage=0x00, Page Length MSB=0 */ + log_response[3] = SUPPORTED_LOG_PAGES_PAGE_LENGTH; + log_response[4] = LOG_PAGE_SUPPORTED_LOG_PAGES_PAGE; + log_response[5] = LOG_PAGE_INFORMATIONAL_EXCEPTIONS_PAGE; + log_response[6] = LOG_PAGE_TEMPERATURE_PAGE; + + xfer_len = min(alloc_len, LOG_PAGE_SUPPORTED_LOG_PAGES_LENGTH); + res = nvme_trans_copy_to_user(hdr, log_response, xfer_len); + + kfree(log_response); + out_mem: + return res; +} + +static int nvme_trans_log_info_exceptions(struct nvme_ns *ns, + struct sg_io_hdr *hdr, int alloc_len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + u8 *log_response; + struct nvme_command c; + struct nvme_dev *dev = ns->dev; + struct nvme_smart_log *smart_log; + dma_addr_t dma_addr; + void *mem; + u8 temp_c; + u16 temp_k; + + log_response = kmalloc(LOG_INFO_EXCP_PAGE_LENGTH, GFP_KERNEL); + if (log_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + memset(log_response, 0, LOG_INFO_EXCP_PAGE_LENGTH); + + mem = dma_alloc_coherent(&dev->pci_dev->dev, + sizeof(struct nvme_smart_log), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out_dma; + } + + /* Get SMART Log Page */ + memset(&c, 0, sizeof(c)); + c.common.opcode = nvme_admin_get_log_page; + c.common.nsid = cpu_to_le32(0xFFFFFFFF); + c.common.prp1 = cpu_to_le64(dma_addr); + c.common.cdw10[0] = cpu_to_le32(((sizeof(struct nvme_smart_log) / + BYTES_TO_DWORDS) << 16) | NVME_GET_SMART_LOG_PAGE); + res = nvme_submit_admin_cmd(dev, &c, NULL); + if (res != NVME_SC_SUCCESS) { + temp_c = LOG_TEMP_UNKNOWN; + } else { + smart_log = mem; + temp_k = (smart_log->temperature[1] << 8) + + (smart_log->temperature[0]); + temp_c = temp_k - KELVIN_TEMP_FACTOR; + } + + log_response[0] = LOG_PAGE_INFORMATIONAL_EXCEPTIONS_PAGE; + /* Subpage=0x00, Page Length MSB=0 */ + log_response[3] = REMAINING_INFO_EXCP_PAGE_LENGTH; + /* Informational Exceptions Log Parameter 1 Start */ + /* Parameter Code=0x0000 bytes 4,5 */ + log_response[6] = 0x23; /* DU=0, TSD=1, ETC=0, TMC=0, FMT_AND_LNK=11b */ + log_response[7] = 0x04; /* PARAMETER LENGTH */ + /* Add sense Code and qualifier = 0x00 each */ + /* Use Temperature from NVMe Get Log Page, convert to C from K */ + log_response[10] = temp_c; + + xfer_len = min(alloc_len, LOG_INFO_EXCP_PAGE_LENGTH); + res = nvme_trans_copy_to_user(hdr, log_response, xfer_len); + + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_smart_log), + mem, dma_addr); + out_dma: + kfree(log_response); + out_mem: + return res; +} + +static int nvme_trans_log_temperature(struct nvme_ns *ns, struct sg_io_hdr *hdr, + int alloc_len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + u8 *log_response; + struct nvme_command c; + struct nvme_dev *dev = ns->dev; + struct nvme_smart_log *smart_log; + dma_addr_t dma_addr; + void *mem; + u32 feature_resp; + u8 temp_c_cur, temp_c_thresh; + u16 temp_k; + + log_response = kmalloc(LOG_TEMP_PAGE_LENGTH, GFP_KERNEL); + if (log_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + memset(log_response, 0, LOG_TEMP_PAGE_LENGTH); + + mem = dma_alloc_coherent(&dev->pci_dev->dev, + sizeof(struct nvme_smart_log), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out_dma; + } + + /* Get SMART Log Page */ + memset(&c, 0, sizeof(c)); + c.common.opcode = nvme_admin_get_log_page; + c.common.nsid = cpu_to_le32(0xFFFFFFFF); + c.common.prp1 = cpu_to_le64(dma_addr); + c.common.cdw10[0] = cpu_to_le32(((sizeof(struct nvme_smart_log) / + BYTES_TO_DWORDS) << 16) | NVME_GET_SMART_LOG_PAGE); + res = nvme_submit_admin_cmd(dev, &c, NULL); + if (res != NVME_SC_SUCCESS) { + temp_c_cur = LOG_TEMP_UNKNOWN; + } else { + smart_log = mem; + temp_k = (smart_log->temperature[1] << 8) + + (smart_log->temperature[0]); + temp_c_cur = temp_k - KELVIN_TEMP_FACTOR; + } + + /* Get Features for Temp Threshold */ + res = nvme_get_features(dev, NVME_FEAT_TEMP_THRESH, 0, 0, + &feature_resp); + if (res != NVME_SC_SUCCESS) + temp_c_thresh = LOG_TEMP_UNKNOWN; + else + temp_c_thresh = (feature_resp & 0xFFFF) - KELVIN_TEMP_FACTOR; + + log_response[0] = LOG_PAGE_TEMPERATURE_PAGE; + /* Subpage=0x00, Page Length MSB=0 */ + log_response[3] = REMAINING_TEMP_PAGE_LENGTH; + /* Temperature Log Parameter 1 (Temperature) Start */ + /* Parameter Code = 0x0000 */ + log_response[6] = 0x01; /* Format and Linking = 01b */ + log_response[7] = 0x02; /* Parameter Length */ + /* Use Temperature from NVMe Get Log Page, convert to C from K */ + log_response[9] = temp_c_cur; + /* Temperature Log Parameter 2 (Reference Temperature) Start */ + log_response[11] = 0x01; /* Parameter Code = 0x0001 */ + log_response[12] = 0x01; /* Format and Linking = 01b */ + log_response[13] = 0x02; /* Parameter Length */ + /* Use Temperature Thresh from NVMe Get Log Page, convert to C from K */ + log_response[15] = temp_c_thresh; + + xfer_len = min(alloc_len, LOG_TEMP_PAGE_LENGTH); + res = nvme_trans_copy_to_user(hdr, log_response, xfer_len); + + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_smart_log), + mem, dma_addr); + out_dma: + kfree(log_response); + out_mem: + return res; +} + +/* MODE SENSE Helper Functions */ + +static int nvme_trans_fill_mode_parm_hdr(u8 *resp, int len, u8 cdb10, u8 llbaa, + u16 mode_data_length, u16 blk_desc_len) +{ + /* Quick check to make sure I don't stomp on my own memory... */ + if ((cdb10 && len < 8) || (!cdb10 && len < 4)) + return SNTI_INTERNAL_ERROR; + + if (cdb10) { + resp[0] = (mode_data_length & 0xFF00) >> 8; + resp[1] = (mode_data_length & 0x00FF); + /* resp[2] and [3] are zero */ + resp[4] = llbaa; + resp[5] = RESERVED_FIELD; + resp[6] = (blk_desc_len & 0xFF00) >> 8; + resp[7] = (blk_desc_len & 0x00FF); + } else { + resp[0] = (mode_data_length & 0x00FF); + /* resp[1] and [2] are zero */ + resp[3] = (blk_desc_len & 0x00FF); + } + + return SNTI_TRANSLATION_SUCCESS; +} + +static int nvme_trans_fill_blk_desc(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *resp, int len, u8 llbaa) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ns *id_ns; + u8 flbas; + u32 lba_length; + + if (llbaa == 0 && len < MODE_PAGE_BLK_DES_LEN) + return SNTI_INTERNAL_ERROR; + else if (llbaa > 0 && len < MODE_PAGE_LLBAA_BLK_DES_LEN) + return SNTI_INTERNAL_ERROR; + + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + + /* nvme ns identify */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ns = mem; + flbas = (id_ns->flbas) & 0x0F; + lba_length = (1 << (id_ns->lbaf[flbas].ds)); + + if (llbaa == 0) { + __be32 tmp_cap = cpu_to_be32(le64_to_cpu(id_ns->ncap)); + /* Byte 4 is reserved */ + __be32 tmp_len = cpu_to_be32(lba_length & 0x00FFFFFF); + + memcpy(resp, &tmp_cap, sizeof(u32)); + memcpy(&resp[4], &tmp_len, sizeof(u32)); + } else { + __be64 tmp_cap = cpu_to_be64(le64_to_cpu(id_ns->ncap)); + __be32 tmp_len = cpu_to_be32(lba_length); + + memcpy(resp, &tmp_cap, sizeof(u64)); + /* Bytes 8, 9, 10, 11 are reserved */ + memcpy(&resp[12], &tmp_len, sizeof(u32)); + } + + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out: + return res; +} + +static int nvme_trans_fill_control_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *resp, + int len) +{ + if (len < MODE_PAGE_CONTROL_LEN) + return SNTI_INTERNAL_ERROR; + + resp[0] = MODE_PAGE_CONTROL; + resp[1] = MODE_PAGE_CONTROL_LEN_FIELD; + resp[2] = 0x0E; /* TST=000b, TMF_ONLY=0, DPICZ=1, + * D_SENSE=1, GLTSD=1, RLEC=0 */ + resp[3] = 0x12; /* Q_ALGO_MODIFIER=1h, NUAR=0, QERR=01b */ + /* Byte 4: VS=0, RAC=0, UA_INT=0, SWP=0 */ + resp[5] = 0x40; /* ATO=0, TAS=1, ATMPE=0, RWWP=0, AUTOLOAD=0 */ + /* resp[6] and [7] are obsolete, thus zero */ + resp[8] = 0xFF; /* Busy timeout period = 0xffff */ + resp[9] = 0xFF; + /* Bytes 10,11: Extended selftest completion time = 0x0000 */ + + return SNTI_TRANSLATION_SUCCESS; +} + +static int nvme_trans_fill_caching_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, + u8 *resp, int len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + u32 feature_resp; + u8 vwc; + + if (len < MODE_PAGE_CACHING_LEN) + return SNTI_INTERNAL_ERROR; + + nvme_sc = nvme_get_features(dev, NVME_FEAT_VOLATILE_WC, 0, 0, + &feature_resp); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out; + if (nvme_sc) { + res = nvme_sc; + goto out; + } + vwc = feature_resp & 0x00000001; + + resp[0] = MODE_PAGE_CACHING; + resp[1] = MODE_PAGE_CACHING_LEN_FIELD; + resp[2] = vwc << 2; + + out: + return res; +} + +static int nvme_trans_fill_pow_cnd_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *resp, + int len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + + if (len < MODE_PAGE_POW_CND_LEN) + return SNTI_INTERNAL_ERROR; + + resp[0] = MODE_PAGE_POWER_CONDITION; + resp[1] = MODE_PAGE_POW_CND_LEN_FIELD; + /* All other bytes are zero */ + + return res; +} + +static int nvme_trans_fill_inf_exc_page(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *resp, + int len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + + if (len < MODE_PAGE_INF_EXC_LEN) + return SNTI_INTERNAL_ERROR; + + resp[0] = MODE_PAGE_INFO_EXCEP; + resp[1] = MODE_PAGE_INF_EXC_LEN_FIELD; + resp[2] = 0x88; + /* All other bytes are zero */ + + return res; +} + +static int nvme_trans_fill_all_pages(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *resp, int len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u16 mode_pages_offset_1 = 0; + u16 mode_pages_offset_2, mode_pages_offset_3, mode_pages_offset_4; + + mode_pages_offset_2 = mode_pages_offset_1 + MODE_PAGE_CACHING_LEN; + mode_pages_offset_3 = mode_pages_offset_2 + MODE_PAGE_CONTROL_LEN; + mode_pages_offset_4 = mode_pages_offset_3 + MODE_PAGE_POW_CND_LEN; + + res = nvme_trans_fill_caching_page(ns, hdr, &resp[mode_pages_offset_1], + MODE_PAGE_CACHING_LEN); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + res = nvme_trans_fill_control_page(ns, hdr, &resp[mode_pages_offset_2], + MODE_PAGE_CONTROL_LEN); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + res = nvme_trans_fill_pow_cnd_page(ns, hdr, &resp[mode_pages_offset_3], + MODE_PAGE_POW_CND_LEN); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + res = nvme_trans_fill_inf_exc_page(ns, hdr, &resp[mode_pages_offset_4], + MODE_PAGE_INF_EXC_LEN); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + + out: + return res; +} + +static inline int nvme_trans_get_blk_desc_len(u8 dbd, u8 llbaa) +{ + if (dbd == MODE_SENSE_BLK_DESC_ENABLED) { + /* SPC-4: len = 8 x Num_of_descriptors if llbaa = 0, 16x if 1 */ + return 8 * (llbaa + 1) * MODE_SENSE_BLK_DESC_COUNT; + } else { + return 0; + } +} + +static int nvme_trans_mode_page_create(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *cmd, + u16 alloc_len, u8 cdb10, + int (*mode_page_fill_func) + (struct nvme_ns *, + struct sg_io_hdr *hdr, u8 *, int), + u16 mode_pages_tot_len) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int xfer_len; + u8 *response; + u8 dbd, llbaa; + u16 resp_size; + int mph_size; + u16 mode_pages_offset_1; + u16 blk_desc_len, blk_desc_offset, mode_data_length; + + dbd = GET_MODE_SENSE_DBD(cmd); + llbaa = GET_MODE_SENSE_LLBAA(cmd); + mph_size = GET_MODE_SENSE_MPH_SIZE(cdb10); + blk_desc_len = nvme_trans_get_blk_desc_len(dbd, llbaa); + + resp_size = mph_size + blk_desc_len + mode_pages_tot_len; + /* Refer spc4r34 Table 440 for calculation of Mode data Length field */ + mode_data_length = 3 + (3 * cdb10) + blk_desc_len + mode_pages_tot_len; + + blk_desc_offset = mph_size; + mode_pages_offset_1 = blk_desc_offset + blk_desc_len; + + response = kmalloc(resp_size, GFP_KERNEL); + if (response == NULL) { + res = -ENOMEM; + goto out_mem; + } + memset(response, 0, resp_size); + + res = nvme_trans_fill_mode_parm_hdr(&response[0], mph_size, cdb10, + llbaa, mode_data_length, blk_desc_len); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out_free; + if (blk_desc_len > 0) { + res = nvme_trans_fill_blk_desc(ns, hdr, + &response[blk_desc_offset], + blk_desc_len, llbaa); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out_free; + } + res = mode_page_fill_func(ns, hdr, &response[mode_pages_offset_1], + mode_pages_tot_len); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out_free; + + xfer_len = min(alloc_len, resp_size); + res = nvme_trans_copy_to_user(hdr, response, xfer_len); + + out_free: + kfree(response); + out_mem: + return res; +} + +/* Read Capacity Helper Functions */ + +static void nvme_trans_fill_read_cap(u8 *response, struct nvme_id_ns *id_ns, + u8 cdb16) +{ + u8 flbas; + u32 lba_length; + u64 rlba; + u8 prot_en; + u8 p_type_lut[4] = {0, 0, 1, 2}; + __be64 tmp_rlba; + __be32 tmp_rlba_32; + __be32 tmp_len; + + flbas = (id_ns->flbas) & 0x0F; + lba_length = (1 << (id_ns->lbaf[flbas].ds)); + rlba = le64_to_cpup(&id_ns->nsze) - 1; + (id_ns->dps) ? (prot_en = 0x01) : (prot_en = 0); + + if (!cdb16) { + if (rlba > 0xFFFFFFFF) + rlba = 0xFFFFFFFF; + tmp_rlba_32 = cpu_to_be32(rlba); + tmp_len = cpu_to_be32(lba_length); + memcpy(response, &tmp_rlba_32, sizeof(u32)); + memcpy(&response[4], &tmp_len, sizeof(u32)); + } else { + tmp_rlba = cpu_to_be64(rlba); + tmp_len = cpu_to_be32(lba_length); + memcpy(response, &tmp_rlba, sizeof(u64)); + memcpy(&response[8], &tmp_len, sizeof(u32)); + response[12] = (p_type_lut[id_ns->dps & 0x3] << 1) | prot_en; + /* P_I_Exponent = 0x0 | LBPPBE = 0x0 */ + /* LBPME = 0 | LBPRZ = 0 | LALBA = 0x00 */ + /* Bytes 16-31 - Reserved */ + } +} + +/* Start Stop Unit Helper Functions */ + +static int nvme_trans_power_state(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 pc, u8 pcmod, u8 start) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ctrl *id_ctrl; + int lowest_pow_st; /* max npss = lowest power consumption */ + unsigned ps_desired = 0; + + /* NVMe Controller Identify */ + mem = dma_alloc_coherent(&dev->pci_dev->dev, + sizeof(struct nvme_id_ctrl), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + nvme_sc = nvme_identify(dev, 0, 1, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ctrl = mem; + lowest_pow_st = id_ctrl->npss - 1; + + switch (pc) { + case NVME_POWER_STATE_START_VALID: + /* Action unspecified if POWER CONDITION MODIFIER != 0 */ + if (pcmod == 0 && start == 0x1) + ps_desired = POWER_STATE_0; + if (pcmod == 0 && start == 0x0) + ps_desired = lowest_pow_st; + break; + case NVME_POWER_STATE_ACTIVE: + /* Action unspecified if POWER CONDITION MODIFIER != 0 */ + if (pcmod == 0) + ps_desired = POWER_STATE_0; + break; + case NVME_POWER_STATE_IDLE: + /* Action unspecified if POWER CONDITION MODIFIER != [0,1,2] */ + /* min of desired state and (lps-1) because lps is STOP */ + if (pcmod == 0x0) + ps_desired = min(POWER_STATE_1, (lowest_pow_st - 1)); + else if (pcmod == 0x1) + ps_desired = min(POWER_STATE_2, (lowest_pow_st - 1)); + else if (pcmod == 0x2) + ps_desired = min(POWER_STATE_3, (lowest_pow_st - 1)); + break; + case NVME_POWER_STATE_STANDBY: + /* Action unspecified if POWER CONDITION MODIFIER != [0,1] */ + if (pcmod == 0x0) + ps_desired = max(0, (lowest_pow_st - 2)); + else if (pcmod == 0x1) + ps_desired = max(0, (lowest_pow_st - 1)); + break; + case NVME_POWER_STATE_LU_CONTROL: + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + nvme_sc = nvme_set_features(dev, NVME_FEAT_POWER_MGMT, ps_desired, 0, + NULL); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) + res = nvme_sc; + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ctrl), mem, + dma_addr); + out: + return res; +} + +/* Write Buffer Helper Functions */ +/* Also using this for Format Unit with hdr passed as NULL, and buffer_id, 0 */ + +static int nvme_trans_send_fw_cmd(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 opcode, u32 tot_len, u32 offset, + u8 buffer_id) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + struct nvme_command c; + struct nvme_iod *iod = NULL; + unsigned length; + + memset(&c, 0, sizeof(c)); + c.common.opcode = opcode; + if (opcode == nvme_admin_download_fw) { + if (hdr->iovec_count > 0) { + /* Assuming SGL is not allowed for this command */ + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + iod = nvme_map_user_pages(dev, DMA_TO_DEVICE, + (unsigned long)hdr->dxferp, tot_len); + if (IS_ERR(iod)) { + res = PTR_ERR(iod); + goto out; + } + length = nvme_setup_prps(dev, &c.common, iod, tot_len, + GFP_KERNEL); + if (length != tot_len) { + res = -ENOMEM; + goto out_unmap; + } + + c.dlfw.numd = cpu_to_le32((tot_len/BYTES_TO_DWORDS) - 1); + c.dlfw.offset = cpu_to_le32(offset/BYTES_TO_DWORDS); + } else if (opcode == nvme_admin_activate_fw) { + u32 cdw10 = buffer_id | NVME_FWACT_REPL_ACTV; + c.common.cdw10[0] = cpu_to_le32(cdw10); + } + + nvme_sc = nvme_submit_admin_cmd(dev, &c, NULL); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_unmap; + if (nvme_sc) + res = nvme_sc; + + out_unmap: + if (opcode == nvme_admin_download_fw) { + nvme_unmap_user_pages(dev, DMA_TO_DEVICE, iod); + nvme_free_iod(dev, iod); + } + out: + return res; +} + +/* Mode Select Helper Functions */ + +static inline void nvme_trans_modesel_get_bd_len(u8 *parm_list, u8 cdb10, + u16 *bd_len, u8 *llbaa) +{ + if (cdb10) { + /* 10 Byte CDB */ + *bd_len = (parm_list[MODE_SELECT_10_BD_OFFSET] << 8) + + parm_list[MODE_SELECT_10_BD_OFFSET + 1]; + *llbaa = parm_list[MODE_SELECT_10_LLBAA_OFFSET] && + MODE_SELECT_10_LLBAA_MASK; + } else { + /* 6 Byte CDB */ + *bd_len = parm_list[MODE_SELECT_6_BD_OFFSET]; + } +} + +static void nvme_trans_modesel_save_bd(struct nvme_ns *ns, u8 *parm_list, + u16 idx, u16 bd_len, u8 llbaa) +{ + u16 bd_num; + + bd_num = bd_len / ((llbaa == 0) ? + SHORT_DESC_BLOCK : LONG_DESC_BLOCK); + /* Store block descriptor info if a FORMAT UNIT comes later */ + /* TODO Saving 1st BD info; what to do if multiple BD received? */ + if (llbaa == 0) { + /* Standard Block Descriptor - spc4r34 7.5.5.1 */ + ns->mode_select_num_blocks = + (parm_list[idx + 1] << 16) + + (parm_list[idx + 2] << 8) + + (parm_list[idx + 3]); + + ns->mode_select_block_len = + (parm_list[idx + 5] << 16) + + (parm_list[idx + 6] << 8) + + (parm_list[idx + 7]); + } else { + /* Long LBA Block Descriptor - sbc3r27 6.4.2.3 */ + ns->mode_select_num_blocks = + (((u64)parm_list[idx + 0]) << 56) + + (((u64)parm_list[idx + 1]) << 48) + + (((u64)parm_list[idx + 2]) << 40) + + (((u64)parm_list[idx + 3]) << 32) + + (((u64)parm_list[idx + 4]) << 24) + + (((u64)parm_list[idx + 5]) << 16) + + (((u64)parm_list[idx + 6]) << 8) + + ((u64)parm_list[idx + 7]); + + ns->mode_select_block_len = + (parm_list[idx + 12] << 24) + + (parm_list[idx + 13] << 16) + + (parm_list[idx + 14] << 8) + + (parm_list[idx + 15]); + } +} + +static u16 nvme_trans_modesel_get_mp(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *mode_page, u8 page_code) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + unsigned dword11; + + switch (page_code) { + case MODE_PAGE_CACHING: + dword11 = ((mode_page[2] & CACHING_MODE_PAGE_WCE_MASK) ? 1 : 0); + nvme_sc = nvme_set_features(dev, NVME_FEAT_VOLATILE_WC, dword11, + 0, NULL); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + break; + if (nvme_sc) { + res = nvme_sc; + break; + } + break; + case MODE_PAGE_CONTROL: + break; + case MODE_PAGE_POWER_CONDITION: + /* Verify the OS is not trying to set timers */ + if ((mode_page[2] & 0x01) != 0 || (mode_page[3] & 0x0F) != 0) { + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_PARAMETER, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + if (!res) + res = SNTI_INTERNAL_ERROR; + break; + } + break; + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + if (!res) + res = SNTI_INTERNAL_ERROR; + break; + } + + return res; +} + +static int nvme_trans_modesel_data(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd, u16 parm_list_len, u8 pf, + u8 sp, u8 cdb10) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 *parm_list; + u16 bd_len; + u8 llbaa = 0; + u16 index, saved_index; + u8 page_code; + u16 mp_size; + + /* Get parm list from data-in/out buffer */ + parm_list = kmalloc(parm_list_len, GFP_KERNEL); + if (parm_list == NULL) { + res = -ENOMEM; + goto out; + } + + res = nvme_trans_copy_from_user(hdr, parm_list, parm_list_len); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out_mem; + + nvme_trans_modesel_get_bd_len(parm_list, cdb10, &bd_len, &llbaa); + index = (cdb10) ? (MODE_SELECT_10_MPH_SIZE) : (MODE_SELECT_6_MPH_SIZE); + + if (bd_len != 0) { + /* Block Descriptors present, parse */ + nvme_trans_modesel_save_bd(ns, parm_list, index, bd_len, llbaa); + index += bd_len; + } + saved_index = index; + + /* Multiple mode pages may be present; iterate through all */ + /* In 1st Iteration, don't do NVME Command, only check for CDB errors */ + do { + page_code = parm_list[index] & MODE_SELECT_PAGE_CODE_MASK; + mp_size = parm_list[index + 1] + 2; + if ((page_code != MODE_PAGE_CACHING) && + (page_code != MODE_PAGE_CONTROL) && + (page_code != MODE_PAGE_POWER_CONDITION)) { + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out_mem; + } + index += mp_size; + } while (index < parm_list_len); + + /* In 2nd Iteration, do the NVME Commands */ + index = saved_index; + do { + page_code = parm_list[index] & MODE_SELECT_PAGE_CODE_MASK; + mp_size = parm_list[index + 1] + 2; + res = nvme_trans_modesel_get_mp(ns, hdr, &parm_list[index], + page_code); + if (res != SNTI_TRANSLATION_SUCCESS) + break; + index += mp_size; + } while (index < parm_list_len); + + out_mem: + kfree(parm_list); + out: + return res; +} + +/* Format Unit Helper Functions */ + +static int nvme_trans_fmt_set_blk_size_count(struct nvme_ns *ns, + struct sg_io_hdr *hdr) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ns *id_ns; + u8 flbas; + + /* + * SCSI Expects a MODE SELECT would have been issued prior to + * a FORMAT UNIT, and the block size and number would be used + * from the block descriptor in it. If a MODE SELECT had not + * been issued, FORMAT shall use the current values for both. + */ + + if (ns->mode_select_num_blocks == 0 || ns->mode_select_block_len == 0) { + mem = dma_alloc_coherent(&dev->pci_dev->dev, + sizeof(struct nvme_id_ns), &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + /* nvme ns identify */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ns = mem; + + if (ns->mode_select_num_blocks == 0) + ns->mode_select_num_blocks = le64_to_cpu(id_ns->ncap); + if (ns->mode_select_block_len == 0) { + flbas = (id_ns->flbas) & 0x0F; + ns->mode_select_block_len = + (1 << (id_ns->lbaf[flbas].ds)); + } + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + mem, dma_addr); + } + out: + return res; +} + +static int nvme_trans_fmt_get_parm_header(struct sg_io_hdr *hdr, u8 len, + u8 format_prot_info, u8 *nvme_pf_code) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 *parm_list; + u8 pf_usage, pf_code; + + parm_list = kmalloc(len, GFP_KERNEL); + if (parm_list == NULL) { + res = -ENOMEM; + goto out; + } + res = nvme_trans_copy_from_user(hdr, parm_list, len); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out_mem; + + if ((parm_list[FORMAT_UNIT_IMMED_OFFSET] & + FORMAT_UNIT_IMMED_MASK) != 0) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out_mem; + } + + if (len == FORMAT_UNIT_LONG_PARM_LIST_LEN && + (parm_list[FORMAT_UNIT_PROT_INT_OFFSET] & 0x0F) != 0) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out_mem; + } + pf_usage = parm_list[FORMAT_UNIT_PROT_FIELD_USAGE_OFFSET] & + FORMAT_UNIT_PROT_FIELD_USAGE_MASK; + pf_code = (pf_usage << 2) | format_prot_info; + switch (pf_code) { + case 0: + *nvme_pf_code = 0; + break; + case 2: + *nvme_pf_code = 1; + break; + case 3: + *nvme_pf_code = 2; + break; + case 7: + *nvme_pf_code = 3; + break; + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + + out_mem: + kfree(parm_list); + out: + return res; +} + +static int nvme_trans_fmt_send_cmd(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 prot_info) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ns *id_ns; + u8 i; + u8 flbas, nlbaf; + u8 selected_lbaf = 0xFF; + u32 cdw10 = 0; + struct nvme_command c; + + /* Loop thru LBAF's in id_ns to match reqd lbaf, put in cdw10 */ + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + /* nvme ns identify */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ns = mem; + flbas = (id_ns->flbas) & 0x0F; + nlbaf = id_ns->nlbaf; + + for (i = 0; i < nlbaf; i++) { + if (ns->mode_select_block_len == (1 << (id_ns->lbaf[i].ds))) { + selected_lbaf = i; + break; + } + } + if (selected_lbaf > 0x0F) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_PARAMETER, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + } + if (ns->mode_select_num_blocks != le64_to_cpu(id_ns->ncap)) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_PARAMETER, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + } + + cdw10 |= prot_info << 5; + cdw10 |= selected_lbaf & 0x0F; + memset(&c, 0, sizeof(c)); + c.format.opcode = nvme_admin_format_nvm; + c.format.nsid = cpu_to_le32(ns->ns_id); + c.format.cdw10 = cpu_to_le32(cdw10); + + nvme_sc = nvme_submit_admin_cmd(dev, &c, NULL); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) + res = nvme_sc; + + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out: + return res; +} + +/* Read/Write Helper Functions */ + +static inline void nvme_trans_get_io_cdb6(u8 *cmd, + struct nvme_trans_io_cdb *cdb_info) +{ + cdb_info->fua = 0; + cdb_info->prot_info = 0; + cdb_info->lba = GET_U32_FROM_CDB(cmd, IO_6_CDB_LBA_OFFSET) & + IO_6_CDB_LBA_MASK; + cdb_info->xfer_len = GET_U8_FROM_CDB(cmd, IO_6_CDB_TX_LEN_OFFSET); + + /* sbc3r27 sec 5.32 - TRANSFER LEN of 0 implies a 256 Block transfer */ + if (cdb_info->xfer_len == 0) + cdb_info->xfer_len = IO_6_DEFAULT_TX_LEN; +} + +static inline void nvme_trans_get_io_cdb10(u8 *cmd, + struct nvme_trans_io_cdb *cdb_info) +{ + cdb_info->fua = GET_U8_FROM_CDB(cmd, IO_10_CDB_FUA_OFFSET) & + IO_CDB_FUA_MASK; + cdb_info->prot_info = GET_U8_FROM_CDB(cmd, IO_10_CDB_WP_OFFSET) & + IO_CDB_WP_MASK >> IO_CDB_WP_SHIFT; + cdb_info->lba = GET_U32_FROM_CDB(cmd, IO_10_CDB_LBA_OFFSET); + cdb_info->xfer_len = GET_U16_FROM_CDB(cmd, IO_10_CDB_TX_LEN_OFFSET); +} + +static inline void nvme_trans_get_io_cdb12(u8 *cmd, + struct nvme_trans_io_cdb *cdb_info) +{ + cdb_info->fua = GET_U8_FROM_CDB(cmd, IO_12_CDB_FUA_OFFSET) & + IO_CDB_FUA_MASK; + cdb_info->prot_info = GET_U8_FROM_CDB(cmd, IO_12_CDB_WP_OFFSET) & + IO_CDB_WP_MASK >> IO_CDB_WP_SHIFT; + cdb_info->lba = GET_U32_FROM_CDB(cmd, IO_12_CDB_LBA_OFFSET); + cdb_info->xfer_len = GET_U32_FROM_CDB(cmd, IO_12_CDB_TX_LEN_OFFSET); +} + +static inline void nvme_trans_get_io_cdb16(u8 *cmd, + struct nvme_trans_io_cdb *cdb_info) +{ + cdb_info->fua = GET_U8_FROM_CDB(cmd, IO_16_CDB_FUA_OFFSET) & + IO_CDB_FUA_MASK; + cdb_info->prot_info = GET_U8_FROM_CDB(cmd, IO_16_CDB_WP_OFFSET) & + IO_CDB_WP_MASK >> IO_CDB_WP_SHIFT; + cdb_info->lba = GET_U64_FROM_CDB(cmd, IO_16_CDB_LBA_OFFSET); + cdb_info->xfer_len = GET_U32_FROM_CDB(cmd, IO_16_CDB_TX_LEN_OFFSET); +} + +static inline u32 nvme_trans_io_get_num_cmds(struct sg_io_hdr *hdr, + struct nvme_trans_io_cdb *cdb_info, + u32 max_blocks) +{ + /* If using iovecs, send one nvme command per vector */ + if (hdr->iovec_count > 0) + return hdr->iovec_count; + else if (cdb_info->xfer_len > max_blocks) + return ((cdb_info->xfer_len - 1) / max_blocks) + 1; + else + return 1; +} + +static u16 nvme_trans_io_get_control(struct nvme_ns *ns, + struct nvme_trans_io_cdb *cdb_info) +{ + u16 control = 0; + + /* When Protection information support is added, implement here */ + + if (cdb_info->fua > 0) + control |= NVME_RW_FUA; + + return control; +} + +static int nvme_trans_do_nvme_io(struct nvme_ns *ns, struct sg_io_hdr *hdr, + struct nvme_trans_io_cdb *cdb_info, u8 is_write) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_dev *dev = ns->dev; + struct nvme_queue *nvmeq; + u32 num_cmds; + struct nvme_iod *iod; + u64 unit_len; + u64 unit_num_blocks; /* Number of blocks to xfer in each nvme cmd */ + u32 retcode; + u32 i = 0; + u64 nvme_offset = 0; + void __user *next_mapping_addr; + struct nvme_command c; + u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read); + u16 control; + u32 max_blocks = nvme_block_nr(ns, dev->max_hw_sectors); + + num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks); + + /* + * This loop handles two cases. + * First, when an SGL is used in the form of an iovec list: + * - Use iov_base as the next mapping address for the nvme command_id + * - Use iov_len as the data transfer length for the command. + * Second, when we have a single buffer + * - If larger than max_blocks, split into chunks, offset + * each nvme command accordingly. + */ + for (i = 0; i < num_cmds; i++) { + memset(&c, 0, sizeof(c)); + if (hdr->iovec_count > 0) { + struct sg_iovec sgl; + + retcode = copy_from_user(&sgl, hdr->dxferp + + i * sizeof(struct sg_iovec), + sizeof(struct sg_iovec)); + if (retcode) + return -EFAULT; + unit_len = sgl.iov_len; + unit_num_blocks = unit_len >> ns->lba_shift; + next_mapping_addr = sgl.iov_base; + } else { + unit_num_blocks = min((u64)max_blocks, + (cdb_info->xfer_len - nvme_offset)); + unit_len = unit_num_blocks << ns->lba_shift; + next_mapping_addr = hdr->dxferp + + ((1 << ns->lba_shift) * nvme_offset); + } + + c.rw.opcode = opcode; + c.rw.nsid = cpu_to_le32(ns->ns_id); + c.rw.slba = cpu_to_le64(cdb_info->lba + nvme_offset); + c.rw.length = cpu_to_le16(unit_num_blocks - 1); + control = nvme_trans_io_get_control(ns, cdb_info); + c.rw.control = cpu_to_le16(control); + + iod = nvme_map_user_pages(dev, + (is_write) ? DMA_TO_DEVICE : DMA_FROM_DEVICE, + (unsigned long)next_mapping_addr, unit_len); + if (IS_ERR(iod)) { + res = PTR_ERR(iod); + goto out; + } + retcode = nvme_setup_prps(dev, &c.common, iod, unit_len, + GFP_KERNEL); + if (retcode != unit_len) { + nvme_unmap_user_pages(dev, + (is_write) ? DMA_TO_DEVICE : DMA_FROM_DEVICE, + iod); + nvme_free_iod(dev, iod); + res = -ENOMEM; + goto out; + } + + nvme_offset += unit_num_blocks; + + nvmeq = get_nvmeq(dev); + /* + * Since nvme_submit_sync_cmd sleeps, we can't keep + * preemption disabled. We may be preempted at any + * point, and be rescheduled to a different CPU. That + * will cause cacheline bouncing, but no additional + * races since q_lock already protects against other + * CPUs. + */ + put_nvmeq(nvmeq); + nvme_sc = nvme_submit_sync_cmd(nvmeq, &c, NULL, + NVME_IO_TIMEOUT); + if (nvme_sc != NVME_SC_SUCCESS) { + nvme_unmap_user_pages(dev, + (is_write) ? DMA_TO_DEVICE : DMA_FROM_DEVICE, + iod); + nvme_free_iod(dev, iod); + res = nvme_trans_status_code(hdr, nvme_sc); + goto out; + } + nvme_unmap_user_pages(dev, + (is_write) ? DMA_TO_DEVICE : DMA_FROM_DEVICE, + iod); + nvme_free_iod(dev, iod); + } + res = nvme_trans_status_code(hdr, NVME_SC_SUCCESS); + + out: + return res; +} + + +/* SCSI Command Translation Functions */ + +static int nvme_trans_io(struct nvme_ns *ns, struct sg_io_hdr *hdr, u8 is_write, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + struct nvme_trans_io_cdb cdb_info; + u8 opcode = cmd[0]; + u64 xfer_bytes; + u64 sum_iov_len = 0; + struct sg_iovec sgl; + int i; + size_t not_copied; + + /* Extract Fields from CDB */ + switch (opcode) { + case WRITE_6: + case READ_6: + nvme_trans_get_io_cdb6(cmd, &cdb_info); + break; + case WRITE_10: + case READ_10: + nvme_trans_get_io_cdb10(cmd, &cdb_info); + break; + case WRITE_12: + case READ_12: + nvme_trans_get_io_cdb12(cmd, &cdb_info); + break; + case WRITE_16: + case READ_16: + nvme_trans_get_io_cdb16(cmd, &cdb_info); + break; + default: + /* Will never really reach here */ + res = SNTI_INTERNAL_ERROR; + goto out; + } + + /* Calculate total length of transfer (in bytes) */ + if (hdr->iovec_count > 0) { + for (i = 0; i < hdr->iovec_count; i++) { + not_copied = copy_from_user(&sgl, hdr->dxferp + + i * sizeof(struct sg_iovec), + sizeof(struct sg_iovec)); + if (not_copied) + return -EFAULT; + sum_iov_len += sgl.iov_len; + /* IO vector sizes should be multiples of block size */ + if (sgl.iov_len % (1 << ns->lba_shift) != 0) { + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_PARAMETER, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + } + } else { + sum_iov_len = hdr->dxfer_len; + } + + /* As Per sg ioctl howto, if the lengths differ, use the lower one */ + xfer_bytes = min(((u64)hdr->dxfer_len), sum_iov_len); + + /* If block count and actual data buffer size dont match, error out */ + if (xfer_bytes != (cdb_info.xfer_len << ns->lba_shift)) { + res = -EINVAL; + goto out; + } + + /* Check for 0 length transfer - it is not illegal */ + if (cdb_info.xfer_len == 0) + goto out; + + /* Send NVMe IO Command(s) */ + res = nvme_trans_do_nvme_io(ns, hdr, &cdb_info, is_write); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + + out: + return res; +} + +static int nvme_trans_inquiry(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 evpd; + u8 page_code; + int alloc_len; + u8 *inq_response; + + evpd = GET_INQ_EVPD_BIT(cmd); + page_code = GET_INQ_PAGE_CODE(cmd); + alloc_len = GET_INQ_ALLOC_LENGTH(cmd); + + inq_response = kmalloc(STANDARD_INQUIRY_LENGTH, GFP_KERNEL); + if (inq_response == NULL) { + res = -ENOMEM; + goto out_mem; + } + + if (evpd == 0) { + if (page_code == INQ_STANDARD_INQUIRY_PAGE) { + res = nvme_trans_standard_inquiry_page(ns, hdr, + inq_response, alloc_len); + } else { + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + } + } else { + switch (page_code) { + case VPD_SUPPORTED_PAGES: + res = nvme_trans_supported_vpd_pages(ns, hdr, + inq_response, alloc_len); + break; + case VPD_SERIAL_NUMBER: + res = nvme_trans_unit_serial_page(ns, hdr, inq_response, + alloc_len); + break; + case VPD_DEVICE_IDENTIFIERS: + res = nvme_trans_device_id_page(ns, hdr, inq_response, + alloc_len); + break; + case VPD_EXTENDED_INQUIRY: + res = nvme_trans_ext_inq_page(ns, hdr, alloc_len); + break; + case VPD_BLOCK_DEV_CHARACTERISTICS: + res = nvme_trans_bdev_char_page(ns, hdr, alloc_len); + break; + default: + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, + SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + } + kfree(inq_response); + out_mem: + return res; +} + +static int nvme_trans_log_sense(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u16 alloc_len; + u8 sp; + u8 pc; + u8 page_code; + + sp = GET_U8_FROM_CDB(cmd, LOG_SENSE_CDB_SP_OFFSET); + if (sp != LOG_SENSE_CDB_SP_NOT_ENABLED) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + pc = GET_U8_FROM_CDB(cmd, LOG_SENSE_CDB_PC_OFFSET); + page_code = pc & LOG_SENSE_CDB_PAGE_CODE_MASK; + pc = (pc & LOG_SENSE_CDB_PC_MASK) >> LOG_SENSE_CDB_PC_SHIFT; + if (pc != LOG_SENSE_CDB_PC_CUMULATIVE_VALUES) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + alloc_len = GET_U16_FROM_CDB(cmd, LOG_SENSE_CDB_ALLOC_LENGTH_OFFSET); + switch (page_code) { + case LOG_PAGE_SUPPORTED_LOG_PAGES_PAGE: + res = nvme_trans_log_supp_pages(ns, hdr, alloc_len); + break; + case LOG_PAGE_INFORMATIONAL_EXCEPTIONS_PAGE: + res = nvme_trans_log_info_exceptions(ns, hdr, alloc_len); + break; + case LOG_PAGE_TEMPERATURE_PAGE: + res = nvme_trans_log_temperature(ns, hdr, alloc_len); + break; + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + + out: + return res; +} + +static int nvme_trans_mode_select(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 cdb10 = 0; + u16 parm_list_len; + u8 page_format; + u8 save_pages; + + page_format = GET_U8_FROM_CDB(cmd, MODE_SELECT_CDB_PAGE_FORMAT_OFFSET); + page_format &= MODE_SELECT_CDB_PAGE_FORMAT_MASK; + + save_pages = GET_U8_FROM_CDB(cmd, MODE_SELECT_CDB_SAVE_PAGES_OFFSET); + save_pages &= MODE_SELECT_CDB_SAVE_PAGES_MASK; + + if (GET_OPCODE(cmd) == MODE_SELECT) { + parm_list_len = GET_U8_FROM_CDB(cmd, + MODE_SELECT_6_CDB_PARAM_LIST_LENGTH_OFFSET); + } else { + parm_list_len = GET_U16_FROM_CDB(cmd, + MODE_SELECT_10_CDB_PARAM_LIST_LENGTH_OFFSET); + cdb10 = 1; + } + + if (parm_list_len != 0) { + /* + * According to SPC-4 r24, a paramter list length field of 0 + * shall not be considered an error + */ + res = nvme_trans_modesel_data(ns, hdr, cmd, parm_list_len, + page_format, save_pages, cdb10); + } + + return res; +} + +static int nvme_trans_mode_sense(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u16 alloc_len; + u8 cdb10 = 0; + u8 page_code; + u8 pc; + + if (GET_OPCODE(cmd) == MODE_SENSE) { + alloc_len = GET_U8_FROM_CDB(cmd, MODE_SENSE6_ALLOC_LEN_OFFSET); + } else { + alloc_len = GET_U16_FROM_CDB(cmd, + MODE_SENSE10_ALLOC_LEN_OFFSET); + cdb10 = 1; + } + + pc = GET_U8_FROM_CDB(cmd, MODE_SENSE_PAGE_CONTROL_OFFSET) & + MODE_SENSE_PAGE_CONTROL_MASK; + if (pc != MODE_SENSE_PC_CURRENT_VALUES) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + + page_code = GET_U8_FROM_CDB(cmd, MODE_SENSE_PAGE_CODE_OFFSET) & + MODE_SENSE_PAGE_CODE_MASK; + switch (page_code) { + case MODE_PAGE_CACHING: + res = nvme_trans_mode_page_create(ns, hdr, cmd, alloc_len, + cdb10, + &nvme_trans_fill_caching_page, + MODE_PAGE_CACHING_LEN); + break; + case MODE_PAGE_CONTROL: + res = nvme_trans_mode_page_create(ns, hdr, cmd, alloc_len, + cdb10, + &nvme_trans_fill_control_page, + MODE_PAGE_CONTROL_LEN); + break; + case MODE_PAGE_POWER_CONDITION: + res = nvme_trans_mode_page_create(ns, hdr, cmd, alloc_len, + cdb10, + &nvme_trans_fill_pow_cnd_page, + MODE_PAGE_POW_CND_LEN); + break; + case MODE_PAGE_INFO_EXCEP: + res = nvme_trans_mode_page_create(ns, hdr, cmd, alloc_len, + cdb10, + &nvme_trans_fill_inf_exc_page, + MODE_PAGE_INF_EXC_LEN); + break; + case MODE_PAGE_RETURN_ALL: + res = nvme_trans_mode_page_create(ns, hdr, cmd, alloc_len, + cdb10, + &nvme_trans_fill_all_pages, + MODE_PAGE_ALL_LEN); + break; + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + + out: + return res; +} + +static int nvme_trans_read_capacity(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + u32 alloc_len = READ_CAP_10_RESP_SIZE; + u32 resp_size = READ_CAP_10_RESP_SIZE; + u32 xfer_len; + u8 cdb16; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ns *id_ns; + u8 *response; + + cdb16 = IS_READ_CAP_16(cmd); + if (cdb16) { + alloc_len = GET_READ_CAP_16_ALLOC_LENGTH(cmd); + resp_size = READ_CAP_16_RESP_SIZE; + } + + mem = dma_alloc_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + /* nvme ns identify */ + nvme_sc = nvme_identify(dev, ns->ns_id, 0, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ns = mem; + + response = kmalloc(resp_size, GFP_KERNEL); + if (response == NULL) { + res = -ENOMEM; + goto out_dma; + } + memset(response, 0, resp_size); + nvme_trans_fill_read_cap(response, id_ns, cdb16); + + xfer_len = min(alloc_len, resp_size); + res = nvme_trans_copy_to_user(hdr, response, xfer_len); + + kfree(response); + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ns), mem, + dma_addr); + out: + return res; +} + +static int nvme_trans_report_luns(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + u32 alloc_len, xfer_len, resp_size; + u8 select_report; + u8 *response; + struct nvme_dev *dev = ns->dev; + dma_addr_t dma_addr; + void *mem; + struct nvme_id_ctrl *id_ctrl; + u32 ll_length, lun_id; + u8 lun_id_offset = REPORT_LUNS_FIRST_LUN_OFFSET; + __be32 tmp_len; + + alloc_len = GET_REPORT_LUNS_ALLOC_LENGTH(cmd); + select_report = GET_U8_FROM_CDB(cmd, REPORT_LUNS_SR_OFFSET); + + if ((select_report != ALL_LUNS_RETURNED) && + (select_report != ALL_WELL_KNOWN_LUNS_RETURNED) && + (select_report != RESTRICTED_LUNS_RETURNED)) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } else { + /* NVMe Controller Identify */ + mem = dma_alloc_coherent(&dev->pci_dev->dev, + sizeof(struct nvme_id_ctrl), + &dma_addr, GFP_KERNEL); + if (mem == NULL) { + res = -ENOMEM; + goto out; + } + nvme_sc = nvme_identify(dev, 0, 1, dma_addr); + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out_dma; + if (nvme_sc) { + res = nvme_sc; + goto out_dma; + } + id_ctrl = mem; + ll_length = le32_to_cpu(id_ctrl->nn) * LUN_ENTRY_SIZE; + resp_size = ll_length + LUN_DATA_HEADER_SIZE; + + if (alloc_len < resp_size) { + res = nvme_trans_completion(hdr, + SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out_dma; + } + + response = kmalloc(resp_size, GFP_KERNEL); + if (response == NULL) { + res = -ENOMEM; + goto out_dma; + } + memset(response, 0, resp_size); + + /* The first LUN ID will always be 0 per the SAM spec */ + for (lun_id = 0; lun_id < le32_to_cpu(id_ctrl->nn); lun_id++) { + /* + * Set the LUN Id and then increment to the next LUN + * location in the parameter data. + */ + __be64 tmp_id = cpu_to_be64(lun_id); + memcpy(&response[lun_id_offset], &tmp_id, sizeof(u64)); + lun_id_offset += LUN_ENTRY_SIZE; + } + tmp_len = cpu_to_be32(ll_length); + memcpy(response, &tmp_len, sizeof(u32)); + } + + xfer_len = min(alloc_len, resp_size); + res = nvme_trans_copy_to_user(hdr, response, xfer_len); + + kfree(response); + out_dma: + dma_free_coherent(&dev->pci_dev->dev, sizeof(struct nvme_id_ctrl), mem, + dma_addr); + out: + return res; +} + +static int nvme_trans_request_sense(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 alloc_len, xfer_len, resp_size; + u8 desc_format; + u8 *response; + + alloc_len = GET_REQUEST_SENSE_ALLOC_LENGTH(cmd); + desc_format = GET_U8_FROM_CDB(cmd, REQUEST_SENSE_DESC_OFFSET); + desc_format &= REQUEST_SENSE_DESC_MASK; + + resp_size = ((desc_format) ? (DESC_FMT_SENSE_DATA_SIZE) : + (FIXED_FMT_SENSE_DATA_SIZE)); + response = kmalloc(resp_size, GFP_KERNEL); + if (response == NULL) { + res = -ENOMEM; + goto out; + } + memset(response, 0, resp_size); + + if (desc_format == DESCRIPTOR_FORMAT_SENSE_DATA_TYPE) { + /* Descriptor Format Sense Data */ + response[0] = DESC_FORMAT_SENSE_DATA; + response[1] = NO_SENSE; + /* TODO How is LOW POWER CONDITION ON handled? (byte 2) */ + response[2] = SCSI_ASC_NO_SENSE; + response[3] = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + /* SDAT_OVFL = 0 | Additional Sense Length = 0 */ + } else { + /* Fixed Format Sense Data */ + response[0] = FIXED_SENSE_DATA; + /* Byte 1 = Obsolete */ + response[2] = NO_SENSE; /* FM, EOM, ILI, SDAT_OVFL = 0 */ + /* Bytes 3-6 - Information - set to zero */ + response[7] = FIXED_SENSE_DATA_ADD_LENGTH; + /* Bytes 8-11 - Cmd Specific Information - set to zero */ + response[12] = SCSI_ASC_NO_SENSE; + response[13] = SCSI_ASCQ_CAUSE_NOT_REPORTABLE; + /* Byte 14 = Field Replaceable Unit Code = 0 */ + /* Bytes 15-17 - SKSV=0; Sense Key Specific = 0 */ + } + + xfer_len = min(alloc_len, resp_size); + res = nvme_trans_copy_to_user(hdr, response, xfer_len); + + kfree(response); + out: + return res; +} + +static int nvme_trans_security_protocol(struct nvme_ns *ns, + struct sg_io_hdr *hdr, + u8 *cmd) +{ + return nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_ILLEGAL_COMMAND, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); +} + +static int nvme_trans_start_stop(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_queue *nvmeq; + struct nvme_command c; + u8 immed, pcmod, pc, no_flush, start; + + immed = GET_U8_FROM_CDB(cmd, START_STOP_UNIT_CDB_IMMED_OFFSET); + pcmod = GET_U8_FROM_CDB(cmd, START_STOP_UNIT_CDB_POWER_COND_MOD_OFFSET); + pc = GET_U8_FROM_CDB(cmd, START_STOP_UNIT_CDB_POWER_COND_OFFSET); + no_flush = GET_U8_FROM_CDB(cmd, START_STOP_UNIT_CDB_NO_FLUSH_OFFSET); + start = GET_U8_FROM_CDB(cmd, START_STOP_UNIT_CDB_START_OFFSET); + + immed &= START_STOP_UNIT_CDB_IMMED_MASK; + pcmod &= START_STOP_UNIT_CDB_POWER_COND_MOD_MASK; + pc = (pc & START_STOP_UNIT_CDB_POWER_COND_MASK) >> NIBBLE_SHIFT; + no_flush &= START_STOP_UNIT_CDB_NO_FLUSH_MASK; + start &= START_STOP_UNIT_CDB_START_MASK; + + if (immed != 0) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + } else { + if (no_flush == 0) { + /* Issue NVME FLUSH command prior to START STOP UNIT */ + memset(&c, 0, sizeof(c)); + c.common.opcode = nvme_cmd_flush; + c.common.nsid = cpu_to_le32(ns->ns_id); + + nvmeq = get_nvmeq(ns->dev); + put_nvmeq(nvmeq); + nvme_sc = nvme_submit_sync_cmd(nvmeq, &c, NULL, NVME_IO_TIMEOUT); + + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out; + if (nvme_sc) { + res = nvme_sc; + goto out; + } + } + /* Setup the expected power state transition */ + res = nvme_trans_power_state(ns, hdr, pc, pcmod, start); + } + + out: + return res; +} + +static int nvme_trans_synchronize_cache(struct nvme_ns *ns, + struct sg_io_hdr *hdr, u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + int nvme_sc; + struct nvme_command c; + struct nvme_queue *nvmeq; + + memset(&c, 0, sizeof(c)); + c.common.opcode = nvme_cmd_flush; + c.common.nsid = cpu_to_le32(ns->ns_id); + + nvmeq = get_nvmeq(ns->dev); + put_nvmeq(nvmeq); + nvme_sc = nvme_submit_sync_cmd(nvmeq, &c, NULL, NVME_IO_TIMEOUT); + + res = nvme_trans_status_code(hdr, nvme_sc); + if (res) + goto out; + if (nvme_sc) + res = nvme_sc; + + out: + return res; +} + +static int nvme_trans_format_unit(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u8 parm_hdr_len = 0; + u8 nvme_pf_code = 0; + u8 format_prot_info, long_list, format_data; + + format_prot_info = GET_U8_FROM_CDB(cmd, + FORMAT_UNIT_CDB_FORMAT_PROT_INFO_OFFSET); + long_list = GET_U8_FROM_CDB(cmd, FORMAT_UNIT_CDB_LONG_LIST_OFFSET); + format_data = GET_U8_FROM_CDB(cmd, FORMAT_UNIT_CDB_FORMAT_DATA_OFFSET); + + format_prot_info = (format_prot_info & + FORMAT_UNIT_CDB_FORMAT_PROT_INFO_MASK) >> + FORMAT_UNIT_CDB_FORMAT_PROT_INFO_SHIFT; + long_list &= FORMAT_UNIT_CDB_LONG_LIST_MASK; + format_data &= FORMAT_UNIT_CDB_FORMAT_DATA_MASK; + + if (format_data != 0) { + if (format_prot_info != 0) { + if (long_list == 0) + parm_hdr_len = FORMAT_UNIT_SHORT_PARM_LIST_LEN; + else + parm_hdr_len = FORMAT_UNIT_LONG_PARM_LIST_LEN; + } + } else if (format_data == 0 && format_prot_info != 0) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + + /* Get parm header from data-in/out buffer */ + /* + * According to the translation spec, the only fields in the parameter + * list we are concerned with are in the header. So allocate only that. + */ + if (parm_hdr_len > 0) { + res = nvme_trans_fmt_get_parm_header(hdr, parm_hdr_len, + format_prot_info, &nvme_pf_code); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + } + + /* Attempt to activate any previously downloaded firmware image */ + res = nvme_trans_send_fw_cmd(ns, hdr, nvme_admin_activate_fw, 0, 0, 0); + + /* Determine Block size and count and send format command */ + res = nvme_trans_fmt_set_blk_size_count(ns, hdr); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + + res = nvme_trans_fmt_send_cmd(ns, hdr, nvme_pf_code); + + out: + return res; +} + +static int nvme_trans_test_unit_ready(struct nvme_ns *ns, + struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + struct nvme_dev *dev = ns->dev; + + if (!(readl(&dev->bar->csts) & NVME_CSTS_RDY)) + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + NOT_READY, SCSI_ASC_LUN_NOT_READY, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + else + res = nvme_trans_completion(hdr, SAM_STAT_GOOD, NO_SENSE, 0, 0); + + return res; +} + +static int nvme_trans_write_buffer(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + int res = SNTI_TRANSLATION_SUCCESS; + u32 buffer_offset, parm_list_length; + u8 buffer_id, mode; + + parm_list_length = + GET_U24_FROM_CDB(cmd, WRITE_BUFFER_CDB_PARM_LIST_LENGTH_OFFSET); + if (parm_list_length % BYTES_TO_DWORDS != 0) { + /* NVMe expects Firmware file to be a whole number of DWORDS */ + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + buffer_id = GET_U8_FROM_CDB(cmd, WRITE_BUFFER_CDB_BUFFER_ID_OFFSET); + if (buffer_id > NVME_MAX_FIRMWARE_SLOT) { + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + goto out; + } + mode = GET_U8_FROM_CDB(cmd, WRITE_BUFFER_CDB_MODE_OFFSET) & + WRITE_BUFFER_CDB_MODE_MASK; + buffer_offset = + GET_U24_FROM_CDB(cmd, WRITE_BUFFER_CDB_BUFFER_OFFSET_OFFSET); + + switch (mode) { + case DOWNLOAD_SAVE_ACTIVATE: + res = nvme_trans_send_fw_cmd(ns, hdr, nvme_admin_download_fw, + parm_list_length, buffer_offset, + buffer_id); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + res = nvme_trans_send_fw_cmd(ns, hdr, nvme_admin_activate_fw, + parm_list_length, buffer_offset, + buffer_id); + break; + case DOWNLOAD_SAVE_DEFER_ACTIVATE: + res = nvme_trans_send_fw_cmd(ns, hdr, nvme_admin_download_fw, + parm_list_length, buffer_offset, + buffer_id); + break; + case ACTIVATE_DEFERRED_MICROCODE: + res = nvme_trans_send_fw_cmd(ns, hdr, nvme_admin_activate_fw, + parm_list_length, buffer_offset, + buffer_id); + break; + default: + res = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_INVALID_CDB, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + + out: + return res; +} + +struct scsi_unmap_blk_desc { + __be64 slba; + __be32 nlb; + u32 resv; +}; + +struct scsi_unmap_parm_list { + __be16 unmap_data_len; + __be16 unmap_blk_desc_data_len; + u32 resv; + struct scsi_unmap_blk_desc desc[0]; +}; + +static int nvme_trans_unmap(struct nvme_ns *ns, struct sg_io_hdr *hdr, + u8 *cmd) +{ + struct nvme_dev *dev = ns->dev; + struct scsi_unmap_parm_list *plist; + struct nvme_dsm_range *range; + struct nvme_queue *nvmeq; + struct nvme_command c; + int i, nvme_sc, res = -ENOMEM; + u16 ndesc, list_len; + dma_addr_t dma_addr; + + list_len = GET_U16_FROM_CDB(cmd, UNMAP_CDB_PARAM_LIST_LENGTH_OFFSET); + if (!list_len) + return -EINVAL; + + plist = kmalloc(list_len, GFP_KERNEL); + if (!plist) + return -ENOMEM; + + res = nvme_trans_copy_from_user(hdr, plist, list_len); + if (res != SNTI_TRANSLATION_SUCCESS) + goto out; + + ndesc = be16_to_cpu(plist->unmap_blk_desc_data_len) >> 4; + if (!ndesc || ndesc > 256) { + res = -EINVAL; + goto out; + } + + range = dma_alloc_coherent(&dev->pci_dev->dev, ndesc * sizeof(*range), + &dma_addr, GFP_KERNEL); + if (!range) + goto out; + + for (i = 0; i < ndesc; i++) { + range[i].nlb = cpu_to_le32(be32_to_cpu(plist->desc[i].nlb)); + range[i].slba = cpu_to_le64(be64_to_cpu(plist->desc[i].slba)); + range[i].cattr = 0; + } + + memset(&c, 0, sizeof(c)); + c.dsm.opcode = nvme_cmd_dsm; + c.dsm.nsid = cpu_to_le32(ns->ns_id); + c.dsm.prp1 = cpu_to_le64(dma_addr); + c.dsm.nr = cpu_to_le32(ndesc - 1); + c.dsm.attributes = cpu_to_le32(NVME_DSMGMT_AD); + + nvmeq = get_nvmeq(dev); + put_nvmeq(nvmeq); + + nvme_sc = nvme_submit_sync_cmd(nvmeq, &c, NULL, NVME_IO_TIMEOUT); + res = nvme_trans_status_code(hdr, nvme_sc); + + dma_free_coherent(&dev->pci_dev->dev, ndesc * sizeof(*range), + range, dma_addr); + out: + kfree(plist); + return res; +} + +static int nvme_scsi_translate(struct nvme_ns *ns, struct sg_io_hdr *hdr) +{ + u8 cmd[BLK_MAX_CDB]; + int retcode; + unsigned int opcode; + + if (hdr->cmdp == NULL) + return -EMSGSIZE; + if (copy_from_user(cmd, hdr->cmdp, hdr->cmd_len)) + return -EFAULT; + + opcode = cmd[0]; + + switch (opcode) { + case READ_6: + case READ_10: + case READ_12: + case READ_16: + retcode = nvme_trans_io(ns, hdr, 0, cmd); + break; + case WRITE_6: + case WRITE_10: + case WRITE_12: + case WRITE_16: + retcode = nvme_trans_io(ns, hdr, 1, cmd); + break; + case INQUIRY: + retcode = nvme_trans_inquiry(ns, hdr, cmd); + break; + case LOG_SENSE: + retcode = nvme_trans_log_sense(ns, hdr, cmd); + break; + case MODE_SELECT: + case MODE_SELECT_10: + retcode = nvme_trans_mode_select(ns, hdr, cmd); + break; + case MODE_SENSE: + case MODE_SENSE_10: + retcode = nvme_trans_mode_sense(ns, hdr, cmd); + break; + case READ_CAPACITY: + retcode = nvme_trans_read_capacity(ns, hdr, cmd); + break; + case SERVICE_ACTION_IN: + if (IS_READ_CAP_16(cmd)) + retcode = nvme_trans_read_capacity(ns, hdr, cmd); + else + goto out; + break; + case REPORT_LUNS: + retcode = nvme_trans_report_luns(ns, hdr, cmd); + break; + case REQUEST_SENSE: + retcode = nvme_trans_request_sense(ns, hdr, cmd); + break; + case SECURITY_PROTOCOL_IN: + case SECURITY_PROTOCOL_OUT: + retcode = nvme_trans_security_protocol(ns, hdr, cmd); + break; + case START_STOP: + retcode = nvme_trans_start_stop(ns, hdr, cmd); + break; + case SYNCHRONIZE_CACHE: + retcode = nvme_trans_synchronize_cache(ns, hdr, cmd); + break; + case FORMAT_UNIT: + retcode = nvme_trans_format_unit(ns, hdr, cmd); + break; + case TEST_UNIT_READY: + retcode = nvme_trans_test_unit_ready(ns, hdr, cmd); + break; + case WRITE_BUFFER: + retcode = nvme_trans_write_buffer(ns, hdr, cmd); + break; + case UNMAP: + retcode = nvme_trans_unmap(ns, hdr, cmd); + break; + default: + out: + retcode = nvme_trans_completion(hdr, SAM_STAT_CHECK_CONDITION, + ILLEGAL_REQUEST, SCSI_ASC_ILLEGAL_COMMAND, + SCSI_ASCQ_CAUSE_NOT_REPORTABLE); + break; + } + return retcode; +} + +int nvme_sg_io(struct nvme_ns *ns, struct sg_io_hdr __user *u_hdr) +{ + struct sg_io_hdr hdr; + int retcode; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + if (copy_from_user(&hdr, u_hdr, sizeof(hdr))) + return -EFAULT; + if (hdr.interface_id != 'S') + return -EINVAL; + if (hdr.cmd_len > BLK_MAX_CDB) + return -EINVAL; + + retcode = nvme_scsi_translate(ns, &hdr); + if (retcode < 0) + return retcode; + if (retcode > 0) + retcode = SNTI_TRANSLATION_SUCCESS; + if (copy_to_user(u_hdr, &hdr, sizeof(sg_io_hdr_t)) > 0) + return -EFAULT; + + return retcode; +} + +int nvme_sg_get_version_num(int __user *ip) +{ + return put_user(sg_version_num, ip); +} diff --git a/drivers/block/nvme.c b/drivers/block/nvme.c deleted file mode 100644 index 9dcefe40380b..000000000000 --- a/drivers/block/nvme.c +++ /dev/null @@ -1,1807 +0,0 @@ -/* - * NVM Express device driver - * Copyright (c) 2011, Intel Corporation. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms and conditions of the GNU General Public License, - * version 2, as published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for - * more details. - * - * You should have received a copy of the GNU General Public License along with - * this program; if not, write to the Free Software Foundation, Inc., - * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#define NVME_Q_DEPTH 1024 -#define SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) -#define CQ_SIZE(depth) (depth * sizeof(struct nvme_completion)) -#define NVME_MINORS 64 -#define NVME_IO_TIMEOUT (5 * HZ) -#define ADMIN_TIMEOUT (60 * HZ) - -static int nvme_major; -module_param(nvme_major, int, 0); - -static int use_threaded_interrupts; -module_param(use_threaded_interrupts, int, 0); - -static DEFINE_SPINLOCK(dev_list_lock); -static LIST_HEAD(dev_list); -static struct task_struct *nvme_thread; - -/* - * Represents an NVM Express device. Each nvme_dev is a PCI function. - */ -struct nvme_dev { - struct list_head node; - struct nvme_queue **queues; - u32 __iomem *dbs; - struct pci_dev *pci_dev; - struct dma_pool *prp_page_pool; - struct dma_pool *prp_small_pool; - int instance; - int queue_count; - int db_stride; - u32 ctrl_config; - struct msix_entry *entry; - struct nvme_bar __iomem *bar; - struct list_head namespaces; - char serial[20]; - char model[40]; - char firmware_rev[8]; - u32 max_hw_sectors; -}; - -/* - * An NVM Express namespace is equivalent to a SCSI LUN - */ -struct nvme_ns { - struct list_head list; - - struct nvme_dev *dev; - struct request_queue *queue; - struct gendisk *disk; - - int ns_id; - int lba_shift; -}; - -/* - * An NVM Express queue. Each device has at least two (one for admin - * commands and one for I/O commands). - */ -struct nvme_queue { - struct device *q_dmadev; - struct nvme_dev *dev; - spinlock_t q_lock; - struct nvme_command *sq_cmds; - volatile struct nvme_completion *cqes; - dma_addr_t sq_dma_addr; - dma_addr_t cq_dma_addr; - wait_queue_head_t sq_full; - wait_queue_t sq_cong_wait; - struct bio_list sq_cong; - u32 __iomem *q_db; - u16 q_depth; - u16 cq_vector; - u16 sq_head; - u16 sq_tail; - u16 cq_head; - u16 cq_phase; - unsigned long cmdid_data[]; -}; - -/* - * Check we didin't inadvertently grow the command struct - */ -static inline void _nvme_check_size(void) -{ - BUILD_BUG_ON(sizeof(struct nvme_rw_command) != 64); - BUILD_BUG_ON(sizeof(struct nvme_create_cq) != 64); - BUILD_BUG_ON(sizeof(struct nvme_create_sq) != 64); - BUILD_BUG_ON(sizeof(struct nvme_delete_queue) != 64); - BUILD_BUG_ON(sizeof(struct nvme_features) != 64); - BUILD_BUG_ON(sizeof(struct nvme_command) != 64); - BUILD_BUG_ON(sizeof(struct nvme_id_ctrl) != 4096); - BUILD_BUG_ON(sizeof(struct nvme_id_ns) != 4096); - BUILD_BUG_ON(sizeof(struct nvme_lba_range_type) != 64); - BUILD_BUG_ON(sizeof(struct nvme_smart_log) != 512); -} - -typedef void (*nvme_completion_fn)(struct nvme_dev *, void *, - struct nvme_completion *); - -struct nvme_cmd_info { - nvme_completion_fn fn; - void *ctx; - unsigned long timeout; -}; - -static struct nvme_cmd_info *nvme_cmd_info(struct nvme_queue *nvmeq) -{ - return (void *)&nvmeq->cmdid_data[BITS_TO_LONGS(nvmeq->q_depth)]; -} - -/** - * alloc_cmdid() - Allocate a Command ID - * @nvmeq: The queue that will be used for this command - * @ctx: A pointer that will be passed to the handler - * @handler: The function to call on completion - * - * Allocate a Command ID for a queue. The data passed in will - * be passed to the completion handler. This is implemented by using - * the bottom two bits of the ctx pointer to store the handler ID. - * Passing in a pointer that's not 4-byte aligned will cause a BUG. - * We can change this if it becomes a problem. - * - * May be called with local interrupts disabled and the q_lock held, - * or with interrupts enabled and no locks held. - */ -static int alloc_cmdid(struct nvme_queue *nvmeq, void *ctx, - nvme_completion_fn handler, unsigned timeout) -{ - int depth = nvmeq->q_depth - 1; - struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); - int cmdid; - - do { - cmdid = find_first_zero_bit(nvmeq->cmdid_data, depth); - if (cmdid >= depth) - return -EBUSY; - } while (test_and_set_bit(cmdid, nvmeq->cmdid_data)); - - info[cmdid].fn = handler; - info[cmdid].ctx = ctx; - info[cmdid].timeout = jiffies + timeout; - return cmdid; -} - -static int alloc_cmdid_killable(struct nvme_queue *nvmeq, void *ctx, - nvme_completion_fn handler, unsigned timeout) -{ - int cmdid; - wait_event_killable(nvmeq->sq_full, - (cmdid = alloc_cmdid(nvmeq, ctx, handler, timeout)) >= 0); - return (cmdid < 0) ? -EINTR : cmdid; -} - -/* Special values must be less than 0x1000 */ -#define CMD_CTX_BASE ((void *)POISON_POINTER_DELTA) -#define CMD_CTX_CANCELLED (0x30C + CMD_CTX_BASE) -#define CMD_CTX_COMPLETED (0x310 + CMD_CTX_BASE) -#define CMD_CTX_INVALID (0x314 + CMD_CTX_BASE) -#define CMD_CTX_FLUSH (0x318 + CMD_CTX_BASE) - -static void special_completion(struct nvme_dev *dev, void *ctx, - struct nvme_completion *cqe) -{ - if (ctx == CMD_CTX_CANCELLED) - return; - if (ctx == CMD_CTX_FLUSH) - return; - if (ctx == CMD_CTX_COMPLETED) { - dev_warn(&dev->pci_dev->dev, - "completed id %d twice on queue %d\n", - cqe->command_id, le16_to_cpup(&cqe->sq_id)); - return; - } - if (ctx == CMD_CTX_INVALID) { - dev_warn(&dev->pci_dev->dev, - "invalid id %d completed on queue %d\n", - cqe->command_id, le16_to_cpup(&cqe->sq_id)); - return; - } - - dev_warn(&dev->pci_dev->dev, "Unknown special completion %p\n", ctx); -} - -/* - * Called with local interrupts disabled and the q_lock held. May not sleep. - */ -static void *free_cmdid(struct nvme_queue *nvmeq, int cmdid, - nvme_completion_fn *fn) -{ - void *ctx; - struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); - - if (cmdid >= nvmeq->q_depth) { - *fn = special_completion; - return CMD_CTX_INVALID; - } - if (fn) - *fn = info[cmdid].fn; - ctx = info[cmdid].ctx; - info[cmdid].fn = special_completion; - info[cmdid].ctx = CMD_CTX_COMPLETED; - clear_bit(cmdid, nvmeq->cmdid_data); - wake_up(&nvmeq->sq_full); - return ctx; -} - -static void *cancel_cmdid(struct nvme_queue *nvmeq, int cmdid, - nvme_completion_fn *fn) -{ - void *ctx; - struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); - if (fn) - *fn = info[cmdid].fn; - ctx = info[cmdid].ctx; - info[cmdid].fn = special_completion; - info[cmdid].ctx = CMD_CTX_CANCELLED; - return ctx; -} - -static struct nvme_queue *get_nvmeq(struct nvme_dev *dev) -{ - return dev->queues[get_cpu() + 1]; -} - -static void put_nvmeq(struct nvme_queue *nvmeq) -{ - put_cpu(); -} - -/** - * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell - * @nvmeq: The queue to use - * @cmd: The command to send - * - * Safe to use from interrupt context - */ -static int nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd) -{ - unsigned long flags; - u16 tail; - spin_lock_irqsave(&nvmeq->q_lock, flags); - tail = nvmeq->sq_tail; - memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); - if (++tail == nvmeq->q_depth) - tail = 0; - writel(tail, nvmeq->q_db); - nvmeq->sq_tail = tail; - spin_unlock_irqrestore(&nvmeq->q_lock, flags); - - return 0; -} - -/* - * The nvme_iod describes the data in an I/O, including the list of PRP - * entries. You can't see it in this data structure because C doesn't let - * me express that. Use nvme_alloc_iod to ensure there's enough space - * allocated to store the PRP list. - */ -struct nvme_iod { - void *private; /* For the use of the submitter of the I/O */ - int npages; /* In the PRP list. 0 means small pool in use */ - int offset; /* Of PRP list */ - int nents; /* Used in scatterlist */ - int length; /* Of data, in bytes */ - dma_addr_t first_dma; - struct scatterlist sg[0]; -}; - -static __le64 **iod_list(struct nvme_iod *iod) -{ - return ((void *)iod) + iod->offset; -} - -/* - * Will slightly overestimate the number of pages needed. This is OK - * as it only leads to a small amount of wasted memory for the lifetime of - * the I/O. - */ -static int nvme_npages(unsigned size) -{ - unsigned nprps = DIV_ROUND_UP(size + PAGE_SIZE, PAGE_SIZE); - return DIV_ROUND_UP(8 * nprps, PAGE_SIZE - 8); -} - -static struct nvme_iod * -nvme_alloc_iod(unsigned nseg, unsigned nbytes, gfp_t gfp) -{ - struct nvme_iod *iod = kmalloc(sizeof(struct nvme_iod) + - sizeof(__le64 *) * nvme_npages(nbytes) + - sizeof(struct scatterlist) * nseg, gfp); - - if (iod) { - iod->offset = offsetof(struct nvme_iod, sg[nseg]); - iod->npages = -1; - iod->length = nbytes; - iod->nents = 0; - } - - return iod; -} - -static void nvme_free_iod(struct nvme_dev *dev, struct nvme_iod *iod) -{ - const int last_prp = PAGE_SIZE / 8 - 1; - int i; - __le64 **list = iod_list(iod); - dma_addr_t prp_dma = iod->first_dma; - - if (iod->npages == 0) - dma_pool_free(dev->prp_small_pool, list[0], prp_dma); - for (i = 0; i < iod->npages; i++) { - __le64 *prp_list = list[i]; - dma_addr_t next_prp_dma = le64_to_cpu(prp_list[last_prp]); - dma_pool_free(dev->prp_page_pool, prp_list, prp_dma); - prp_dma = next_prp_dma; - } - kfree(iod); -} - -static void requeue_bio(struct nvme_dev *dev, struct bio *bio) -{ - struct nvme_queue *nvmeq = get_nvmeq(dev); - if (bio_list_empty(&nvmeq->sq_cong)) - add_wait_queue(&nvmeq->sq_full, &nvmeq->sq_cong_wait); - bio_list_add(&nvmeq->sq_cong, bio); - put_nvmeq(nvmeq); - wake_up_process(nvme_thread); -} - -static void bio_completion(struct nvme_dev *dev, void *ctx, - struct nvme_completion *cqe) -{ - struct nvme_iod *iod = ctx; - struct bio *bio = iod->private; - u16 status = le16_to_cpup(&cqe->status) >> 1; - - if (iod->nents) - dma_unmap_sg(&dev->pci_dev->dev, iod->sg, iod->nents, - bio_data_dir(bio) ? DMA_TO_DEVICE : DMA_FROM_DEVICE); - nvme_free_iod(dev, iod); - if (status) { - bio_endio(bio, -EIO); - } else if (bio->bi_vcnt > bio->bi_idx) { - requeue_bio(dev, bio); - } else { - bio_endio(bio, 0); - } -} - -/* length is in bytes. gfp flags indicates whether we may sleep. */ -static int nvme_setup_prps(struct nvme_dev *dev, - struct nvme_common_command *cmd, struct nvme_iod *iod, - int total_len, gfp_t gfp) -{ - struct dma_pool *pool; - int length = total_len; - struct scatterlist *sg = iod->sg; - int dma_len = sg_dma_len(sg); - u64 dma_addr = sg_dma_address(sg); - int offset = offset_in_page(dma_addr); - __le64 *prp_list; - __le64 **list = iod_list(iod); - dma_addr_t prp_dma; - int nprps, i; - - cmd->prp1 = cpu_to_le64(dma_addr); - length -= (PAGE_SIZE - offset); - if (length <= 0) - return total_len; - - dma_len -= (PAGE_SIZE - offset); - if (dma_len) { - dma_addr += (PAGE_SIZE - offset); - } else { - sg = sg_next(sg); - dma_addr = sg_dma_address(sg); - dma_len = sg_dma_len(sg); - } - - if (length <= PAGE_SIZE) { - cmd->prp2 = cpu_to_le64(dma_addr); - return total_len; - } - - nprps = DIV_ROUND_UP(length, PAGE_SIZE); - if (nprps <= (256 / 8)) { - pool = dev->prp_small_pool; - iod->npages = 0; - } else { - pool = dev->prp_page_pool; - iod->npages = 1; - } - - prp_list = dma_pool_alloc(pool, gfp, &prp_dma); - if (!prp_list) { - cmd->prp2 = cpu_to_le64(dma_addr); - iod->npages = -1; - return (total_len - length) + PAGE_SIZE; - } - list[0] = prp_list; - iod->first_dma = prp_dma; - cmd->prp2 = cpu_to_le64(prp_dma); - i = 0; - for (;;) { - if (i == PAGE_SIZE / 8) { - __le64 *old_prp_list = prp_list; - prp_list = dma_pool_alloc(pool, gfp, &prp_dma); - if (!prp_list) - return total_len - length; - list[iod->npages++] = prp_list; - prp_list[0] = old_prp_list[i - 1]; - old_prp_list[i - 1] = cpu_to_le64(prp_dma); - i = 1; - } - prp_list[i++] = cpu_to_le64(dma_addr); - dma_len -= PAGE_SIZE; - dma_addr += PAGE_SIZE; - length -= PAGE_SIZE; - if (length <= 0) - break; - if (dma_len > 0) - continue; - BUG_ON(dma_len < 0); - sg = sg_next(sg); - dma_addr = sg_dma_address(sg); - dma_len = sg_dma_len(sg); - } - - return total_len; -} - -/* NVMe scatterlists require no holes in the virtual address */ -#define BIOVEC_NOT_VIRT_MERGEABLE(vec1, vec2) ((vec2)->bv_offset || \ - (((vec1)->bv_offset + (vec1)->bv_len) % PAGE_SIZE)) - -static int nvme_map_bio(struct device *dev, struct nvme_iod *iod, - struct bio *bio, enum dma_data_direction dma_dir, int psegs) -{ - struct bio_vec *bvec, *bvprv = NULL; - struct scatterlist *sg = NULL; - int i, old_idx, length = 0, nsegs = 0; - - sg_init_table(iod->sg, psegs); - old_idx = bio->bi_idx; - bio_for_each_segment(bvec, bio, i) { - if (bvprv && BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) { - sg->length += bvec->bv_len; - } else { - if (bvprv && BIOVEC_NOT_VIRT_MERGEABLE(bvprv, bvec)) - break; - sg = sg ? sg + 1 : iod->sg; - sg_set_page(sg, bvec->bv_page, bvec->bv_len, - bvec->bv_offset); - nsegs++; - } - length += bvec->bv_len; - bvprv = bvec; - } - bio->bi_idx = i; - iod->nents = nsegs; - sg_mark_end(sg); - if (dma_map_sg(dev, iod->sg, iod->nents, dma_dir) == 0) { - bio->bi_idx = old_idx; - return -ENOMEM; - } - return length; -} - -static int nvme_submit_flush(struct nvme_queue *nvmeq, struct nvme_ns *ns, - int cmdid) -{ - struct nvme_command *cmnd = &nvmeq->sq_cmds[nvmeq->sq_tail]; - - memset(cmnd, 0, sizeof(*cmnd)); - cmnd->common.opcode = nvme_cmd_flush; - cmnd->common.command_id = cmdid; - cmnd->common.nsid = cpu_to_le32(ns->ns_id); - - if (++nvmeq->sq_tail == nvmeq->q_depth) - nvmeq->sq_tail = 0; - writel(nvmeq->sq_tail, nvmeq->q_db); - - return 0; -} - -static int nvme_submit_flush_data(struct nvme_queue *nvmeq, struct nvme_ns *ns) -{ - int cmdid = alloc_cmdid(nvmeq, (void *)CMD_CTX_FLUSH, - special_completion, NVME_IO_TIMEOUT); - if (unlikely(cmdid < 0)) - return cmdid; - - return nvme_submit_flush(nvmeq, ns, cmdid); -} - -/* - * Called with local interrupts disabled and the q_lock held. May not sleep. - */ -static int nvme_submit_bio_queue(struct nvme_queue *nvmeq, struct nvme_ns *ns, - struct bio *bio) -{ - struct nvme_command *cmnd; - struct nvme_iod *iod; - enum dma_data_direction dma_dir; - int cmdid, length, result = -ENOMEM; - u16 control; - u32 dsmgmt; - int psegs = bio_phys_segments(ns->queue, bio); - - if ((bio->bi_rw & REQ_FLUSH) && psegs) { - result = nvme_submit_flush_data(nvmeq, ns); - if (result) - return result; - } - - iod = nvme_alloc_iod(psegs, bio->bi_size, GFP_ATOMIC); - if (!iod) - goto nomem; - iod->private = bio; - - result = -EBUSY; - cmdid = alloc_cmdid(nvmeq, iod, bio_completion, NVME_IO_TIMEOUT); - if (unlikely(cmdid < 0)) - goto free_iod; - - if ((bio->bi_rw & REQ_FLUSH) && !psegs) - return nvme_submit_flush(nvmeq, ns, cmdid); - - control = 0; - if (bio->bi_rw & REQ_FUA) - control |= NVME_RW_FUA; - if (bio->bi_rw & (REQ_FAILFAST_DEV | REQ_RAHEAD)) - control |= NVME_RW_LR; - - dsmgmt = 0; - if (bio->bi_rw & REQ_RAHEAD) - dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH; - - cmnd = &nvmeq->sq_cmds[nvmeq->sq_tail]; - - memset(cmnd, 0, sizeof(*cmnd)); - if (bio_data_dir(bio)) { - cmnd->rw.opcode = nvme_cmd_write; - dma_dir = DMA_TO_DEVICE; - } else { - cmnd->rw.opcode = nvme_cmd_read; - dma_dir = DMA_FROM_DEVICE; - } - - result = nvme_map_bio(nvmeq->q_dmadev, iod, bio, dma_dir, psegs); - if (result < 0) - goto free_cmdid; - length = result; - - cmnd->rw.command_id = cmdid; - cmnd->rw.nsid = cpu_to_le32(ns->ns_id); - length = nvme_setup_prps(nvmeq->dev, &cmnd->common, iod, length, - GFP_ATOMIC); - cmnd->rw.slba = cpu_to_le64(bio->bi_sector >> (ns->lba_shift - 9)); - cmnd->rw.length = cpu_to_le16((length >> ns->lba_shift) - 1); - cmnd->rw.control = cpu_to_le16(control); - cmnd->rw.dsmgmt = cpu_to_le32(dsmgmt); - - bio->bi_sector += length >> 9; - - if (++nvmeq->sq_tail == nvmeq->q_depth) - nvmeq->sq_tail = 0; - writel(nvmeq->sq_tail, nvmeq->q_db); - - return 0; - - free_cmdid: - free_cmdid(nvmeq, cmdid, NULL); - free_iod: - nvme_free_iod(nvmeq->dev, iod); - nomem: - return result; -} - -static void nvme_make_request(struct request_queue *q, struct bio *bio) -{ - struct nvme_ns *ns = q->queuedata; - struct nvme_queue *nvmeq = get_nvmeq(ns->dev); - int result = -EBUSY; - - spin_lock_irq(&nvmeq->q_lock); - if (bio_list_empty(&nvmeq->sq_cong)) - result = nvme_submit_bio_queue(nvmeq, ns, bio); - if (unlikely(result)) { - if (bio_list_empty(&nvmeq->sq_cong)) - add_wait_queue(&nvmeq->sq_full, &nvmeq->sq_cong_wait); - bio_list_add(&nvmeq->sq_cong, bio); - } - - spin_unlock_irq(&nvmeq->q_lock); - put_nvmeq(nvmeq); -} - -static irqreturn_t nvme_process_cq(struct nvme_queue *nvmeq) -{ - u16 head, phase; - - head = nvmeq->cq_head; - phase = nvmeq->cq_phase; - - for (;;) { - void *ctx; - nvme_completion_fn fn; - struct nvme_completion cqe = nvmeq->cqes[head]; - if ((le16_to_cpu(cqe.status) & 1) != phase) - break; - nvmeq->sq_head = le16_to_cpu(cqe.sq_head); - if (++head == nvmeq->q_depth) { - head = 0; - phase = !phase; - } - - ctx = free_cmdid(nvmeq, cqe.command_id, &fn); - fn(nvmeq->dev, ctx, &cqe); - } - - /* If the controller ignores the cq head doorbell and continuously - * writes to the queue, it is theoretically possible to wrap around - * the queue twice and mistakenly return IRQ_NONE. Linux only - * requires that 0.1% of your interrupts are handled, so this isn't - * a big problem. - */ - if (head == nvmeq->cq_head && phase == nvmeq->cq_phase) - return IRQ_NONE; - - writel(head, nvmeq->q_db + (1 << nvmeq->dev->db_stride)); - nvmeq->cq_head = head; - nvmeq->cq_phase = phase; - - return IRQ_HANDLED; -} - -static irqreturn_t nvme_irq(int irq, void *data) -{ - irqreturn_t result; - struct nvme_queue *nvmeq = data; - spin_lock(&nvmeq->q_lock); - result = nvme_process_cq(nvmeq); - spin_unlock(&nvmeq->q_lock); - return result; -} - -static irqreturn_t nvme_irq_check(int irq, void *data) -{ - struct nvme_queue *nvmeq = data; - struct nvme_completion cqe = nvmeq->cqes[nvmeq->cq_head]; - if ((le16_to_cpu(cqe.status) & 1) != nvmeq->cq_phase) - return IRQ_NONE; - return IRQ_WAKE_THREAD; -} - -static void nvme_abort_command(struct nvme_queue *nvmeq, int cmdid) -{ - spin_lock_irq(&nvmeq->q_lock); - cancel_cmdid(nvmeq, cmdid, NULL); - spin_unlock_irq(&nvmeq->q_lock); -} - -struct sync_cmd_info { - struct task_struct *task; - u32 result; - int status; -}; - -static void sync_completion(struct nvme_dev *dev, void *ctx, - struct nvme_completion *cqe) -{ - struct sync_cmd_info *cmdinfo = ctx; - cmdinfo->result = le32_to_cpup(&cqe->result); - cmdinfo->status = le16_to_cpup(&cqe->status) >> 1; - wake_up_process(cmdinfo->task); -} - -/* - * Returns 0 on success. If the result is negative, it's a Linux error code; - * if the result is positive, it's an NVM Express status code - */ -static int nvme_submit_sync_cmd(struct nvme_queue *nvmeq, - struct nvme_command *cmd, u32 *result, unsigned timeout) -{ - int cmdid; - struct sync_cmd_info cmdinfo; - - cmdinfo.task = current; - cmdinfo.status = -EINTR; - - cmdid = alloc_cmdid_killable(nvmeq, &cmdinfo, sync_completion, - timeout); - if (cmdid < 0) - return cmdid; - cmd->common.command_id = cmdid; - - set_current_state(TASK_KILLABLE); - nvme_submit_cmd(nvmeq, cmd); - schedule(); - - if (cmdinfo.status == -EINTR) { - nvme_abort_command(nvmeq, cmdid); - return -EINTR; - } - - if (result) - *result = cmdinfo.result; - - return cmdinfo.status; -} - -static int nvme_submit_admin_cmd(struct nvme_dev *dev, struct nvme_command *cmd, - u32 *result) -{ - return nvme_submit_sync_cmd(dev->queues[0], cmd, result, ADMIN_TIMEOUT); -} - -static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id) -{ - int status; - struct nvme_command c; - - memset(&c, 0, sizeof(c)); - c.delete_queue.opcode = opcode; - c.delete_queue.qid = cpu_to_le16(id); - - status = nvme_submit_admin_cmd(dev, &c, NULL); - if (status) - return -EIO; - return 0; -} - -static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid, - struct nvme_queue *nvmeq) -{ - int status; - struct nvme_command c; - int flags = NVME_QUEUE_PHYS_CONTIG | NVME_CQ_IRQ_ENABLED; - - memset(&c, 0, sizeof(c)); - c.create_cq.opcode = nvme_admin_create_cq; - c.create_cq.prp1 = cpu_to_le64(nvmeq->cq_dma_addr); - c.create_cq.cqid = cpu_to_le16(qid); - c.create_cq.qsize = cpu_to_le16(nvmeq->q_depth - 1); - c.create_cq.cq_flags = cpu_to_le16(flags); - c.create_cq.irq_vector = cpu_to_le16(nvmeq->cq_vector); - - status = nvme_submit_admin_cmd(dev, &c, NULL); - if (status) - return -EIO; - return 0; -} - -static int adapter_alloc_sq(struct nvme_dev *dev, u16 qid, - struct nvme_queue *nvmeq) -{ - int status; - struct nvme_command c; - int flags = NVME_QUEUE_PHYS_CONTIG | NVME_SQ_PRIO_MEDIUM; - - memset(&c, 0, sizeof(c)); - c.create_sq.opcode = nvme_admin_create_sq; - c.create_sq.prp1 = cpu_to_le64(nvmeq->sq_dma_addr); - c.create_sq.sqid = cpu_to_le16(qid); - c.create_sq.qsize = cpu_to_le16(nvmeq->q_depth - 1); - c.create_sq.sq_flags = cpu_to_le16(flags); - c.create_sq.cqid = cpu_to_le16(qid); - - status = nvme_submit_admin_cmd(dev, &c, NULL); - if (status) - return -EIO; - return 0; -} - -static int adapter_delete_cq(struct nvme_dev *dev, u16 cqid) -{ - return adapter_delete_queue(dev, nvme_admin_delete_cq, cqid); -} - -static int adapter_delete_sq(struct nvme_dev *dev, u16 sqid) -{ - return adapter_delete_queue(dev, nvme_admin_delete_sq, sqid); -} - -static int nvme_identify(struct nvme_dev *dev, unsigned nsid, unsigned cns, - dma_addr_t dma_addr) -{ - struct nvme_command c; - - memset(&c, 0, sizeof(c)); - c.identify.opcode = nvme_admin_identify; - c.identify.nsid = cpu_to_le32(nsid); - c.identify.prp1 = cpu_to_le64(dma_addr); - c.identify.cns = cpu_to_le32(cns); - - return nvme_submit_admin_cmd(dev, &c, NULL); -} - -static int nvme_get_features(struct nvme_dev *dev, unsigned fid, unsigned nsid, - dma_addr_t dma_addr, u32 *result) -{ - struct nvme_command c; - - memset(&c, 0, sizeof(c)); - c.features.opcode = nvme_admin_get_features; - c.features.nsid = cpu_to_le32(nsid); - c.features.prp1 = cpu_to_le64(dma_addr); - c.features.fid = cpu_to_le32(fid); - - return nvme_submit_admin_cmd(dev, &c, result); -} - -static int nvme_set_features(struct nvme_dev *dev, unsigned fid, - unsigned dword11, dma_addr_t dma_addr, u32 *result) -{ - struct nvme_command c; - - memset(&c, 0, sizeof(c)); - c.features.opcode = nvme_admin_set_features; - c.features.prp1 = cpu_to_le64(dma_addr); - c.features.fid = cpu_to_le32(fid); - c.features.dword11 = cpu_to_le32(dword11); - - return nvme_submit_admin_cmd(dev, &c, result); -} - -/** - * nvme_cancel_ios - Cancel outstanding I/Os - * @queue: The queue to cancel I/Os on - * @timeout: True to only cancel I/Os which have timed out - */ -static void nvme_cancel_ios(struct nvme_queue *nvmeq, bool timeout) -{ - int depth = nvmeq->q_depth - 1; - struct nvme_cmd_info *info = nvme_cmd_info(nvmeq); - unsigned long now = jiffies; - int cmdid; - - for_each_set_bit(cmdid, nvmeq->cmdid_data, depth) { - void *ctx; - nvme_completion_fn fn; - static struct nvme_completion cqe = { - .status = cpu_to_le16(NVME_SC_ABORT_REQ) << 1, - }; - - if (timeout && !time_after(now, info[cmdid].timeout)) - continue; - dev_warn(nvmeq->q_dmadev, "Cancelling I/O %d\n", cmdid); - ctx = cancel_cmdid(nvmeq, cmdid, &fn); - fn(nvmeq->dev, ctx, &cqe); - } -} - -static void nvme_free_queue_mem(struct nvme_queue *nvmeq) -{ - dma_free_coherent(nvmeq->q_dmadev, CQ_SIZE(nvmeq->q_depth), - (void *)nvmeq->cqes, nvmeq->cq_dma_addr); - dma_free_coherent(nvmeq->q_dmadev, SQ_SIZE(nvmeq->q_depth), - nvmeq->sq_cmds, nvmeq->sq_dma_addr); - kfree(nvmeq); -} - -static void nvme_free_queue(struct nvme_dev *dev, int qid) -{ - struct nvme_queue *nvmeq = dev->queues[qid]; - int vector = dev->entry[nvmeq->cq_vector].vector; - - spin_lock_irq(&nvmeq->q_lock); - nvme_cancel_ios(nvmeq, false); - while (bio_list_peek(&nvmeq->sq_cong)) { - struct bio *bio = bio_list_pop(&nvmeq->sq_cong); - bio_endio(bio, -EIO); - } - spin_unlock_irq(&nvmeq->q_lock); - - irq_set_affinity_hint(vector, NULL); - free_irq(vector, nvmeq); - - /* Don't tell the adapter to delete the admin queue */ - if (qid) { - adapter_delete_sq(dev, qid); - adapter_delete_cq(dev, qid); - } - - nvme_free_queue_mem(nvmeq); -} - -static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid, - int depth, int vector) -{ - struct device *dmadev = &dev->pci_dev->dev; - unsigned extra = DIV_ROUND_UP(depth, 8) + (depth * - sizeof(struct nvme_cmd_info)); - struct nvme_queue *nvmeq = kzalloc(sizeof(*nvmeq) + extra, GFP_KERNEL); - if (!nvmeq) - return NULL; - - nvmeq->cqes = dma_alloc_coherent(dmadev, CQ_SIZE(depth), - &nvmeq->cq_dma_addr, GFP_KERNEL); - if (!nvmeq->cqes) - goto free_nvmeq; - memset((void *)nvmeq->cqes, 0, CQ_SIZE(depth)); - - nvmeq->sq_cmds = dma_alloc_coherent(dmadev, SQ_SIZE(depth), - &nvmeq->sq_dma_addr, GFP_KERNEL); - if (!nvmeq->sq_cmds) - goto free_cqdma; - - nvmeq->q_dmadev = dmadev; - nvmeq->dev = dev; - spin_lock_init(&nvmeq->q_lock); - nvmeq->cq_head = 0; - nvmeq->cq_phase = 1; - init_waitqueue_head(&nvmeq->sq_full); - init_waitqueue_entry(&nvmeq->sq_cong_wait, nvme_thread); - bio_list_init(&nvmeq->sq_cong); - nvmeq->q_db = &dev->dbs[qid << (dev->db_stride + 1)]; - nvmeq->q_depth = depth; - nvmeq->cq_vector = vector; - - return nvmeq; - - free_cqdma: - dma_free_coherent(dmadev, CQ_SIZE(nvmeq->q_depth), (void *)nvmeq->cqes, - nvmeq->cq_dma_addr); - free_nvmeq: - kfree(nvmeq); - return NULL; -} - -static int queue_request_irq(struct nvme_dev *dev, struct nvme_queue *nvmeq, - const char *name) -{ - if (use_threaded_interrupts) - return request_threaded_irq(dev->entry[nvmeq->cq_vector].vector, - nvme_irq_check, nvme_irq, - IRQF_DISABLED | IRQF_SHARED, - name, nvmeq); - return request_irq(dev->entry[nvmeq->cq_vector].vector, nvme_irq, - IRQF_DISABLED | IRQF_SHARED, name, nvmeq); -} - -static struct nvme_queue *nvme_create_queue(struct nvme_dev *dev, int qid, - int cq_size, int vector) -{ - int result; - struct nvme_queue *nvmeq = nvme_alloc_queue(dev, qid, cq_size, vector); - - if (!nvmeq) - return ERR_PTR(-ENOMEM); - - result = adapter_alloc_cq(dev, qid, nvmeq); - if (result < 0) - goto free_nvmeq; - - result = adapter_alloc_sq(dev, qid, nvmeq); - if (result < 0) - goto release_cq; - - result = queue_request_irq(dev, nvmeq, "nvme"); - if (result < 0) - goto release_sq; - - return nvmeq; - - release_sq: - adapter_delete_sq(dev, qid); - release_cq: - adapter_delete_cq(dev, qid); - free_nvmeq: - dma_free_coherent(nvmeq->q_dmadev, CQ_SIZE(nvmeq->q_depth), - (void *)nvmeq->cqes, nvmeq->cq_dma_addr); - dma_free_coherent(nvmeq->q_dmadev, SQ_SIZE(nvmeq->q_depth), - nvmeq->sq_cmds, nvmeq->sq_dma_addr); - kfree(nvmeq); - return ERR_PTR(result); -} - -static int nvme_configure_admin_queue(struct nvme_dev *dev) -{ - int result = 0; - u32 aqa; - u64 cap; - unsigned long timeout; - struct nvme_queue *nvmeq; - - dev->dbs = ((void __iomem *)dev->bar) + 4096; - - nvmeq = nvme_alloc_queue(dev, 0, 64, 0); - if (!nvmeq) - return -ENOMEM; - - aqa = nvmeq->q_depth - 1; - aqa |= aqa << 16; - - dev->ctrl_config = NVME_CC_ENABLE | NVME_CC_CSS_NVM; - dev->ctrl_config |= (PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT; - dev->ctrl_config |= NVME_CC_ARB_RR | NVME_CC_SHN_NONE; - dev->ctrl_config |= NVME_CC_IOSQES | NVME_CC_IOCQES; - - writel(0, &dev->bar->cc); - writel(aqa, &dev->bar->aqa); - writeq(nvmeq->sq_dma_addr, &dev->bar->asq); - writeq(nvmeq->cq_dma_addr, &dev->bar->acq); - writel(dev->ctrl_config, &dev->bar->cc); - - cap = readq(&dev->bar->cap); - timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; - dev->db_stride = NVME_CAP_STRIDE(cap); - - while (!result && !(readl(&dev->bar->csts) & NVME_CSTS_RDY)) { - msleep(100); - if (fatal_signal_pending(current)) - result = -EINTR; - if (time_after(jiffies, timeout)) { - dev_err(&dev->pci_dev->dev, - "Device not ready; aborting initialisation\n"); - result = -ENODEV; - } - } - - if (result) { - nvme_free_queue_mem(nvmeq); - return result; - } - - result = queue_request_irq(dev, nvmeq, "nvme admin"); - dev->queues[0] = nvmeq; - return result; -} - -static struct nvme_iod *nvme_map_user_pages(struct nvme_dev *dev, int write, - unsigned long addr, unsigned length) -{ - int i, err, count, nents, offset; - struct scatterlist *sg; - struct page **pages; - struct nvme_iod *iod; - - if (addr & 3) - return ERR_PTR(-EINVAL); - if (!length) - return ERR_PTR(-EINVAL); - - offset = offset_in_page(addr); - count = DIV_ROUND_UP(offset + length, PAGE_SIZE); - pages = kcalloc(count, sizeof(*pages), GFP_KERNEL); - if (!pages) - return ERR_PTR(-ENOMEM); - - err = get_user_pages_fast(addr, count, 1, pages); - if (err < count) { - count = err; - err = -EFAULT; - goto put_pages; - } - - iod = nvme_alloc_iod(count, length, GFP_KERNEL); - sg = iod->sg; - sg_init_table(sg, count); - for (i = 0; i < count; i++) { - sg_set_page(&sg[i], pages[i], - min_t(int, length, PAGE_SIZE - offset), offset); - length -= (PAGE_SIZE - offset); - offset = 0; - } - sg_mark_end(&sg[i - 1]); - iod->nents = count; - - err = -ENOMEM; - nents = dma_map_sg(&dev->pci_dev->dev, sg, count, - write ? DMA_TO_DEVICE : DMA_FROM_DEVICE); - if (!nents) - goto free_iod; - - kfree(pages); - return iod; - - free_iod: - kfree(iod); - put_pages: - for (i = 0; i < count; i++) - put_page(pages[i]); - kfree(pages); - return ERR_PTR(err); -} - -static void nvme_unmap_user_pages(struct nvme_dev *dev, int write, - struct nvme_iod *iod) -{ - int i; - - dma_unmap_sg(&dev->pci_dev->dev, iod->sg, iod->nents, - write ? DMA_TO_DEVICE : DMA_FROM_DEVICE); - - for (i = 0; i < iod->nents; i++) - put_page(sg_page(&iod->sg[i])); -} - -static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio) -{ - struct nvme_dev *dev = ns->dev; - struct nvme_queue *nvmeq; - struct nvme_user_io io; - struct nvme_command c; - unsigned length; - int status; - struct nvme_iod *iod; - - if (copy_from_user(&io, uio, sizeof(io))) - return -EFAULT; - length = (io.nblocks + 1) << ns->lba_shift; - - switch (io.opcode) { - case nvme_cmd_write: - case nvme_cmd_read: - case nvme_cmd_compare: - iod = nvme_map_user_pages(dev, io.opcode & 1, io.addr, length); - break; - default: - return -EINVAL; - } - - if (IS_ERR(iod)) - return PTR_ERR(iod); - - memset(&c, 0, sizeof(c)); - c.rw.opcode = io.opcode; - c.rw.flags = io.flags; - c.rw.nsid = cpu_to_le32(ns->ns_id); - c.rw.slba = cpu_to_le64(io.slba); - c.rw.length = cpu_to_le16(io.nblocks); - c.rw.control = cpu_to_le16(io.control); - c.rw.dsmgmt = cpu_to_le16(io.dsmgmt); - c.rw.reftag = io.reftag; - c.rw.apptag = io.apptag; - c.rw.appmask = io.appmask; - /* XXX: metadata */ - length = nvme_setup_prps(dev, &c.common, iod, length, GFP_KERNEL); - - nvmeq = get_nvmeq(dev); - /* - * Since nvme_submit_sync_cmd sleeps, we can't keep preemption - * disabled. We may be preempted at any point, and be rescheduled - * to a different CPU. That will cause cacheline bouncing, but no - * additional races since q_lock already protects against other CPUs. - */ - put_nvmeq(nvmeq); - if (length != (io.nblocks + 1) << ns->lba_shift) - status = -ENOMEM; - else - status = nvme_submit_sync_cmd(nvmeq, &c, NULL, NVME_IO_TIMEOUT); - - nvme_unmap_user_pages(dev, io.opcode & 1, iod); - nvme_free_iod(dev, iod); - return status; -} - -static int nvme_user_admin_cmd(struct nvme_dev *dev, - struct nvme_admin_cmd __user *ucmd) -{ - struct nvme_admin_cmd cmd; - struct nvme_command c; - int status, length; - struct nvme_iod *uninitialized_var(iod); - - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - if (copy_from_user(&cmd, ucmd, sizeof(cmd))) - return -EFAULT; - - memset(&c, 0, sizeof(c)); - c.common.opcode = cmd.opcode; - c.common.flags = cmd.flags; - c.common.nsid = cpu_to_le32(cmd.nsid); - c.common.cdw2[0] = cpu_to_le32(cmd.cdw2); - c.common.cdw2[1] = cpu_to_le32(cmd.cdw3); - c.common.cdw10[0] = cpu_to_le32(cmd.cdw10); - c.common.cdw10[1] = cpu_to_le32(cmd.cdw11); - c.common.cdw10[2] = cpu_to_le32(cmd.cdw12); - c.common.cdw10[3] = cpu_to_le32(cmd.cdw13); - c.common.cdw10[4] = cpu_to_le32(cmd.cdw14); - c.common.cdw10[5] = cpu_to_le32(cmd.cdw15); - - length = cmd.data_len; - if (cmd.data_len) { - iod = nvme_map_user_pages(dev, cmd.opcode & 1, cmd.addr, - length); - if (IS_ERR(iod)) - return PTR_ERR(iod); - length = nvme_setup_prps(dev, &c.common, iod, length, - GFP_KERNEL); - } - - if (length != cmd.data_len) - status = -ENOMEM; - else - status = nvme_submit_admin_cmd(dev, &c, &cmd.result); - - if (cmd.data_len) { - nvme_unmap_user_pages(dev, cmd.opcode & 1, iod); - nvme_free_iod(dev, iod); - } - - if (!status && copy_to_user(&ucmd->result, &cmd.result, - sizeof(cmd.result))) - status = -EFAULT; - - return status; -} - -static int nvme_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd, - unsigned long arg) -{ - struct nvme_ns *ns = bdev->bd_disk->private_data; - - switch (cmd) { - case NVME_IOCTL_ID: - return ns->ns_id; - case NVME_IOCTL_ADMIN_CMD: - return nvme_user_admin_cmd(ns->dev, (void __user *)arg); - case NVME_IOCTL_SUBMIT_IO: - return nvme_submit_io(ns, (void __user *)arg); - default: - return -ENOTTY; - } -} - -static const struct block_device_operations nvme_fops = { - .owner = THIS_MODULE, - .ioctl = nvme_ioctl, - .compat_ioctl = nvme_ioctl, -}; - -static void nvme_resubmit_bios(struct nvme_queue *nvmeq) -{ - while (bio_list_peek(&nvmeq->sq_cong)) { - struct bio *bio = bio_list_pop(&nvmeq->sq_cong); - struct nvme_ns *ns = bio->bi_bdev->bd_disk->private_data; - if (nvme_submit_bio_queue(nvmeq, ns, bio)) { - bio_list_add_head(&nvmeq->sq_cong, bio); - break; - } - if (bio_list_empty(&nvmeq->sq_cong)) - remove_wait_queue(&nvmeq->sq_full, - &nvmeq->sq_cong_wait); - } -} - -static int nvme_kthread(void *data) -{ - struct nvme_dev *dev; - - while (!kthread_should_stop()) { - __set_current_state(TASK_RUNNING); - spin_lock(&dev_list_lock); - list_for_each_entry(dev, &dev_list, node) { - int i; - for (i = 0; i < dev->queue_count; i++) { - struct nvme_queue *nvmeq = dev->queues[i]; - if (!nvmeq) - continue; - spin_lock_irq(&nvmeq->q_lock); - if (nvme_process_cq(nvmeq)) - printk("process_cq did something\n"); - nvme_cancel_ios(nvmeq, true); - nvme_resubmit_bios(nvmeq); - spin_unlock_irq(&nvmeq->q_lock); - } - } - spin_unlock(&dev_list_lock); - set_current_state(TASK_INTERRUPTIBLE); - schedule_timeout(HZ); - } - return 0; -} - -static DEFINE_IDA(nvme_index_ida); - -static int nvme_get_ns_idx(void) -{ - int index, error; - - do { - if (!ida_pre_get(&nvme_index_ida, GFP_KERNEL)) - return -1; - - spin_lock(&dev_list_lock); - error = ida_get_new(&nvme_index_ida, &index); - spin_unlock(&dev_list_lock); - } while (error == -EAGAIN); - - if (error) - index = -1; - return index; -} - -static void nvme_put_ns_idx(int index) -{ - spin_lock(&dev_list_lock); - ida_remove(&nvme_index_ida, index); - spin_unlock(&dev_list_lock); -} - -static struct nvme_ns *nvme_alloc_ns(struct nvme_dev *dev, int nsid, - struct nvme_id_ns *id, struct nvme_lba_range_type *rt) -{ - struct nvme_ns *ns; - struct gendisk *disk; - int lbaf; - - if (rt->attributes & NVME_LBART_ATTRIB_HIDE) - return NULL; - - ns = kzalloc(sizeof(*ns), GFP_KERNEL); - if (!ns) - return NULL; - ns->queue = blk_alloc_queue(GFP_KERNEL); - if (!ns->queue) - goto out_free_ns; - ns->queue->queue_flags = QUEUE_FLAG_DEFAULT; - queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, ns->queue); - queue_flag_set_unlocked(QUEUE_FLAG_NONROT, ns->queue); -/* queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, ns->queue); */ - blk_queue_make_request(ns->queue, nvme_make_request); - ns->dev = dev; - ns->queue->queuedata = ns; - - disk = alloc_disk(NVME_MINORS); - if (!disk) - goto out_free_queue; - ns->ns_id = nsid; - ns->disk = disk; - lbaf = id->flbas & 0xf; - ns->lba_shift = id->lbaf[lbaf].ds; - blk_queue_logical_block_size(ns->queue, 1 << ns->lba_shift); - if (dev->max_hw_sectors) - blk_queue_max_hw_sectors(ns->queue, dev->max_hw_sectors); - - disk->major = nvme_major; - disk->minors = NVME_MINORS; - disk->first_minor = NVME_MINORS * nvme_get_ns_idx(); - disk->fops = &nvme_fops; - disk->private_data = ns; - disk->queue = ns->queue; - disk->driverfs_dev = &dev->pci_dev->dev; - sprintf(disk->disk_name, "nvme%dn%d", dev->instance, nsid); - set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9)); - - return ns; - - out_free_queue: - blk_cleanup_queue(ns->queue); - out_free_ns: - kfree(ns); - return NULL; -} - -static void nvme_ns_free(struct nvme_ns *ns) -{ - int index = ns->disk->first_minor / NVME_MINORS; - put_disk(ns->disk); - nvme_put_ns_idx(index); - blk_cleanup_queue(ns->queue); - kfree(ns); -} - -static int set_queue_count(struct nvme_dev *dev, int count) -{ - int status; - u32 result; - u32 q_count = (count - 1) | ((count - 1) << 16); - - status = nvme_set_features(dev, NVME_FEAT_NUM_QUEUES, q_count, 0, - &result); - if (status) - return -EIO; - return min(result & 0xffff, result >> 16) + 1; -} - -static int nvme_setup_io_queues(struct nvme_dev *dev) -{ - int result, cpu, i, nr_io_queues, db_bar_size, q_depth; - - nr_io_queues = num_online_cpus(); - result = set_queue_count(dev, nr_io_queues); - if (result < 0) - return result; - if (result < nr_io_queues) - nr_io_queues = result; - - /* Deregister the admin queue's interrupt */ - free_irq(dev->entry[0].vector, dev->queues[0]); - - db_bar_size = 4096 + ((nr_io_queues + 1) << (dev->db_stride + 3)); - if (db_bar_size > 8192) { - iounmap(dev->bar); - dev->bar = ioremap(pci_resource_start(dev->pci_dev, 0), - db_bar_size); - dev->dbs = ((void __iomem *)dev->bar) + 4096; - dev->queues[0]->q_db = dev->dbs; - } - - for (i = 0; i < nr_io_queues; i++) - dev->entry[i].entry = i; - for (;;) { - result = pci_enable_msix(dev->pci_dev, dev->entry, - nr_io_queues); - if (result == 0) { - break; - } else if (result > 0) { - nr_io_queues = result; - continue; - } else { - nr_io_queues = 1; - break; - } - } - - result = queue_request_irq(dev, dev->queues[0], "nvme admin"); - /* XXX: handle failure here */ - - cpu = cpumask_first(cpu_online_mask); - for (i = 0; i < nr_io_queues; i++) { - irq_set_affinity_hint(dev->entry[i].vector, get_cpu_mask(cpu)); - cpu = cpumask_next(cpu, cpu_online_mask); - } - - q_depth = min_t(int, NVME_CAP_MQES(readq(&dev->bar->cap)) + 1, - NVME_Q_DEPTH); - for (i = 0; i < nr_io_queues; i++) { - dev->queues[i + 1] = nvme_create_queue(dev, i + 1, q_depth, i); - if (IS_ERR(dev->queues[i + 1])) - return PTR_ERR(dev->queues[i + 1]); - dev->queue_count++; - } - - for (; i < num_possible_cpus(); i++) { - int target = i % rounddown_pow_of_two(dev->queue_count - 1); - dev->queues[i + 1] = dev->queues[target + 1]; - } - - return 0; -} - -static void nvme_free_queues(struct nvme_dev *dev) -{ - int i; - - for (i = dev->queue_count - 1; i >= 0; i--) - nvme_free_queue(dev, i); -} - -static int nvme_dev_add(struct nvme_dev *dev) -{ - int res, nn, i; - struct nvme_ns *ns, *next; - struct nvme_id_ctrl *ctrl; - struct nvme_id_ns *id_ns; - void *mem; - dma_addr_t dma_addr; - - res = nvme_setup_io_queues(dev); - if (res) - return res; - - mem = dma_alloc_coherent(&dev->pci_dev->dev, 8192, &dma_addr, - GFP_KERNEL); - - res = nvme_identify(dev, 0, 1, dma_addr); - if (res) { - res = -EIO; - goto out_free; - } - - ctrl = mem; - nn = le32_to_cpup(&ctrl->nn); - memcpy(dev->serial, ctrl->sn, sizeof(ctrl->sn)); - memcpy(dev->model, ctrl->mn, sizeof(ctrl->mn)); - memcpy(dev->firmware_rev, ctrl->fr, sizeof(ctrl->fr)); - if (ctrl->mdts) { - int shift = NVME_CAP_MPSMIN(readq(&dev->bar->cap)) + 12; - dev->max_hw_sectors = 1 << (ctrl->mdts + shift - 9); - } - - id_ns = mem; - for (i = 1; i <= nn; i++) { - res = nvme_identify(dev, i, 0, dma_addr); - if (res) - continue; - - if (id_ns->ncap == 0) - continue; - - res = nvme_get_features(dev, NVME_FEAT_LBA_RANGE, i, - dma_addr + 4096, NULL); - if (res) - memset(mem + 4096, 0, 4096); - - ns = nvme_alloc_ns(dev, i, mem, mem + 4096); - if (ns) - list_add_tail(&ns->list, &dev->namespaces); - } - list_for_each_entry(ns, &dev->namespaces, list) - add_disk(ns->disk); - - goto out; - - out_free: - list_for_each_entry_safe(ns, next, &dev->namespaces, list) { - list_del(&ns->list); - nvme_ns_free(ns); - } - - out: - dma_free_coherent(&dev->pci_dev->dev, 8192, mem, dma_addr); - return res; -} - -static int nvme_dev_remove(struct nvme_dev *dev) -{ - struct nvme_ns *ns, *next; - - spin_lock(&dev_list_lock); - list_del(&dev->node); - spin_unlock(&dev_list_lock); - - list_for_each_entry_safe(ns, next, &dev->namespaces, list) { - list_del(&ns->list); - del_gendisk(ns->disk); - nvme_ns_free(ns); - } - - nvme_free_queues(dev); - - return 0; -} - -static int nvme_setup_prp_pools(struct nvme_dev *dev) -{ - struct device *dmadev = &dev->pci_dev->dev; - dev->prp_page_pool = dma_pool_create("prp list page", dmadev, - PAGE_SIZE, PAGE_SIZE, 0); - if (!dev->prp_page_pool) - return -ENOMEM; - - /* Optimisation for I/Os between 4k and 128k */ - dev->prp_small_pool = dma_pool_create("prp list 256", dmadev, - 256, 256, 0); - if (!dev->prp_small_pool) { - dma_pool_destroy(dev->prp_page_pool); - return -ENOMEM; - } - return 0; -} - -static void nvme_release_prp_pools(struct nvme_dev *dev) -{ - dma_pool_destroy(dev->prp_page_pool); - dma_pool_destroy(dev->prp_small_pool); -} - -static DEFINE_IDA(nvme_instance_ida); - -static int nvme_set_instance(struct nvme_dev *dev) -{ - int instance, error; - - do { - if (!ida_pre_get(&nvme_instance_ida, GFP_KERNEL)) - return -ENODEV; - - spin_lock(&dev_list_lock); - error = ida_get_new(&nvme_instance_ida, &instance); - spin_unlock(&dev_list_lock); - } while (error == -EAGAIN); - - if (error) - return -ENODEV; - - dev->instance = instance; - return 0; -} - -static void nvme_release_instance(struct nvme_dev *dev) -{ - spin_lock(&dev_list_lock); - ida_remove(&nvme_instance_ida, dev->instance); - spin_unlock(&dev_list_lock); -} - -static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id) -{ - int bars, result = -ENOMEM; - struct nvme_dev *dev; - - dev = kzalloc(sizeof(*dev), GFP_KERNEL); - if (!dev) - return -ENOMEM; - dev->entry = kcalloc(num_possible_cpus(), sizeof(*dev->entry), - GFP_KERNEL); - if (!dev->entry) - goto free; - dev->queues = kcalloc(num_possible_cpus() + 1, sizeof(void *), - GFP_KERNEL); - if (!dev->queues) - goto free; - - if (pci_enable_device_mem(pdev)) - goto free; - pci_set_master(pdev); - bars = pci_select_bars(pdev, IORESOURCE_MEM); - if (pci_request_selected_regions(pdev, bars, "nvme")) - goto disable; - - INIT_LIST_HEAD(&dev->namespaces); - dev->pci_dev = pdev; - pci_set_drvdata(pdev, dev); - dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)); - dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64)); - result = nvme_set_instance(dev); - if (result) - goto disable; - - dev->entry[0].vector = pdev->irq; - - result = nvme_setup_prp_pools(dev); - if (result) - goto disable_msix; - - dev->bar = ioremap(pci_resource_start(pdev, 0), 8192); - if (!dev->bar) { - result = -ENOMEM; - goto disable_msix; - } - - result = nvme_configure_admin_queue(dev); - if (result) - goto unmap; - dev->queue_count++; - - spin_lock(&dev_list_lock); - list_add(&dev->node, &dev_list); - spin_unlock(&dev_list_lock); - - result = nvme_dev_add(dev); - if (result) - goto delete; - - return 0; - - delete: - spin_lock(&dev_list_lock); - list_del(&dev->node); - spin_unlock(&dev_list_lock); - - nvme_free_queues(dev); - unmap: - iounmap(dev->bar); - disable_msix: - pci_disable_msix(pdev); - nvme_release_instance(dev); - nvme_release_prp_pools(dev); - disable: - pci_disable_device(pdev); - pci_release_regions(pdev); - free: - kfree(dev->queues); - kfree(dev->entry); - kfree(dev); - return result; -} - -static void nvme_remove(struct pci_dev *pdev) -{ - struct nvme_dev *dev = pci_get_drvdata(pdev); - nvme_dev_remove(dev); - pci_disable_msix(pdev); - iounmap(dev->bar); - nvme_release_instance(dev); - nvme_release_prp_pools(dev); - pci_disable_device(pdev); - pci_release_regions(pdev); - kfree(dev->queues); - kfree(dev->entry); - kfree(dev); -} - -/* These functions are yet to be implemented */ -#define nvme_error_detected NULL -#define nvme_dump_registers NULL -#define nvme_link_reset NULL -#define nvme_slot_reset NULL -#define nvme_error_resume NULL -#define nvme_suspend NULL -#define nvme_resume NULL - -static const struct pci_error_handlers nvme_err_handler = { - .error_detected = nvme_error_detected, - .mmio_enabled = nvme_dump_registers, - .link_reset = nvme_link_reset, - .slot_reset = nvme_slot_reset, - .resume = nvme_error_resume, -}; - -/* Move to pci_ids.h later */ -#define PCI_CLASS_STORAGE_EXPRESS 0x010802 - -static DEFINE_PCI_DEVICE_TABLE(nvme_id_table) = { - { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) }, - { 0, } -}; -MODULE_DEVICE_TABLE(pci, nvme_id_table); - -static struct pci_driver nvme_driver = { - .name = "nvme", - .id_table = nvme_id_table, - .probe = nvme_probe, - .remove = nvme_remove, - .suspend = nvme_suspend, - .resume = nvme_resume, - .err_handler = &nvme_err_handler, -}; - -static int __init nvme_init(void) -{ - int result; - - nvme_thread = kthread_run(nvme_kthread, NULL, "nvme"); - if (IS_ERR(nvme_thread)) - return PTR_ERR(nvme_thread); - - result = register_blkdev(nvme_major, "nvme"); - if (result < 0) - goto kill_kthread; - else if (result > 0) - nvme_major = result; - - result = pci_register_driver(&nvme_driver); - if (result) - goto unregister_blkdev; - return 0; - - unregister_blkdev: - unregister_blkdev(nvme_major, "nvme"); - kill_kthread: - kthread_stop(nvme_thread); - return result; -} - -static void __exit nvme_exit(void) -{ - pci_unregister_driver(&nvme_driver); - unregister_blkdev(nvme_major, "nvme"); - kthread_stop(nvme_thread); -} - -MODULE_AUTHOR("Matthew Wilcox "); -MODULE_LICENSE("GPL"); -MODULE_VERSION("0.8"); -module_init(nvme_init); -module_exit(nvme_exit); diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c index 9f2d348f7115..3c08983e600a 100644 --- a/drivers/block/pktcdvd.c +++ b/drivers/block/pktcdvd.c @@ -901,7 +901,7 @@ static void pkt_iosched_process_queue(struct pktcdvd_device *pd) pd->iosched.successive_reads += bio->bi_size >> 10; else { pd->iosched.successive_reads = 0; - pd->iosched.last_write = bio->bi_sector + bio_sectors(bio); + pd->iosched.last_write = bio_end_sector(bio); } if (pd->iosched.successive_reads >= HI_SPEED_SWITCH) { if (pd->read_speed == pd->write_speed) { @@ -947,31 +947,6 @@ static int pkt_set_segment_merging(struct pktcdvd_device *pd, struct request_que } } -/* - * Copy CD_FRAMESIZE bytes from src_bio into a destination page - */ -static void pkt_copy_bio_data(struct bio *src_bio, int seg, int offs, struct page *dst_page, int dst_offs) -{ - unsigned int copy_size = CD_FRAMESIZE; - - while (copy_size > 0) { - struct bio_vec *src_bvl = bio_iovec_idx(src_bio, seg); - void *vfrom = kmap_atomic(src_bvl->bv_page) + - src_bvl->bv_offset + offs; - void *vto = page_address(dst_page) + dst_offs; - int len = min_t(int, copy_size, src_bvl->bv_len - offs); - - BUG_ON(len < 0); - memcpy(vto, vfrom, len); - kunmap_atomic(vfrom); - - seg++; - offs = 0; - dst_offs += len; - copy_size -= len; - } -} - /* * Copy all data for this packet to pkt->pages[], so that * a) The number of required segments for the write bio is minimized, which @@ -1181,16 +1156,15 @@ static int pkt_start_recovery(struct packet_data *pkt) new_sector = new_block * (CD_FRAMESIZE >> 9); pkt->sector = new_sector; + bio_reset(pkt->bio); + pkt->bio->bi_bdev = pd->bdev; + pkt->bio->bi_rw = REQ_WRITE; pkt->bio->bi_sector = new_sector; - pkt->bio->bi_next = NULL; - pkt->bio->bi_flags = 1 << BIO_UPTODATE; - pkt->bio->bi_idx = 0; + pkt->bio->bi_size = pkt->frames * CD_FRAMESIZE; + pkt->bio->bi_vcnt = pkt->frames; - BUG_ON(pkt->bio->bi_rw != REQ_WRITE); - BUG_ON(pkt->bio->bi_vcnt != pkt->frames); - BUG_ON(pkt->bio->bi_size != pkt->frames * CD_FRAMESIZE); - BUG_ON(pkt->bio->bi_end_io != pkt_end_io_packet_write); - BUG_ON(pkt->bio->bi_private != pkt); + pkt->bio->bi_end_io = pkt_end_io_packet_write; + pkt->bio->bi_private = pkt; drop_super(sb); return 1; @@ -1325,55 +1299,35 @@ try_next_bio: */ static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt) { - struct bio *bio; int f; - int frames_write; struct bio_vec *bvec = pkt->w_bio->bi_io_vec; + bio_reset(pkt->w_bio); + pkt->w_bio->bi_sector = pkt->sector; + pkt->w_bio->bi_bdev = pd->bdev; + pkt->w_bio->bi_end_io = pkt_end_io_packet_write; + pkt->w_bio->bi_private = pkt; + + /* XXX: locking? */ for (f = 0; f < pkt->frames; f++) { bvec[f].bv_page = pkt->pages[(f * CD_FRAMESIZE) / PAGE_SIZE]; bvec[f].bv_offset = (f * CD_FRAMESIZE) % PAGE_SIZE; + if (!bio_add_page(pkt->w_bio, bvec[f].bv_page, CD_FRAMESIZE, bvec[f].bv_offset)) + BUG(); } + VPRINTK(DRIVER_NAME": vcnt=%d\n", pkt->w_bio->bi_vcnt); /* * Fill-in bvec with data from orig_bios. */ - frames_write = 0; spin_lock(&pkt->lock); - bio_list_for_each(bio, &pkt->orig_bios) { - int segment = bio->bi_idx; - int src_offs = 0; - int first_frame = (bio->bi_sector - pkt->sector) / (CD_FRAMESIZE >> 9); - int num_frames = bio->bi_size / CD_FRAMESIZE; - BUG_ON(first_frame < 0); - BUG_ON(first_frame + num_frames > pkt->frames); - for (f = first_frame; f < first_frame + num_frames; f++) { - struct bio_vec *src_bvl = bio_iovec_idx(bio, segment); - - while (src_offs >= src_bvl->bv_len) { - src_offs -= src_bvl->bv_len; - segment++; - BUG_ON(segment >= bio->bi_vcnt); - src_bvl = bio_iovec_idx(bio, segment); - } + bio_copy_data(pkt->w_bio, pkt->orig_bios.head); - if (src_bvl->bv_len - src_offs >= CD_FRAMESIZE) { - bvec[f].bv_page = src_bvl->bv_page; - bvec[f].bv_offset = src_bvl->bv_offset + src_offs; - } else { - pkt_copy_bio_data(bio, segment, src_offs, - bvec[f].bv_page, bvec[f].bv_offset); - } - src_offs += CD_FRAMESIZE; - frames_write++; - } - } pkt_set_state(pkt, PACKET_WRITE_WAIT_STATE); spin_unlock(&pkt->lock); VPRINTK("pkt_start_write: Writing %d frames for zone %llx\n", - frames_write, (unsigned long long)pkt->sector); - BUG_ON(frames_write != pkt->write_size); + pkt->write_size, (unsigned long long)pkt->sector); if (test_bit(PACKET_MERGE_SEGS, &pd->flags) || (pkt->write_size < pkt->frames)) { pkt_make_local_copy(pkt, bvec); @@ -1383,16 +1337,6 @@ static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt) } /* Start the write request */ - bio_reset(pkt->w_bio); - pkt->w_bio->bi_sector = pkt->sector; - pkt->w_bio->bi_bdev = pd->bdev; - pkt->w_bio->bi_end_io = pkt_end_io_packet_write; - pkt->w_bio->bi_private = pkt; - for (f = 0; f < pkt->frames; f++) - if (!bio_add_page(pkt->w_bio, bvec[f].bv_page, CD_FRAMESIZE, bvec[f].bv_offset)) - BUG(); - VPRINTK(DRIVER_NAME": vcnt=%d\n", pkt->w_bio->bi_vcnt); - atomic_set(&pkt->io_wait, 1); pkt->w_bio->bi_rw = WRITE; pkt_queue_bio(pd, pkt->w_bio); @@ -2431,7 +2375,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio) cloned_bio->bi_bdev = pd->bdev; cloned_bio->bi_private = psd; cloned_bio->bi_end_io = pkt_end_io_read_cloned; - pd->stats.secs_r += bio->bi_size >> 9; + pd->stats.secs_r += bio_sectors(bio); pkt_queue_bio(pd, cloned_bio); return; } @@ -2452,7 +2396,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio) zone = ZONE(bio->bi_sector, pd); VPRINTK("pkt_make_request: start = %6llx stop = %6llx\n", (unsigned long long)bio->bi_sector, - (unsigned long long)(bio->bi_sector + bio_sectors(bio))); + (unsigned long long)bio_end_sector(bio)); /* Check if we have to split the bio */ { @@ -2460,7 +2404,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio) sector_t last_zone; int first_sectors; - last_zone = ZONE(bio->bi_sector + bio_sectors(bio) - 1, pd); + last_zone = ZONE(bio_end_sector(bio) - 1, pd); if (last_zone != zone) { BUG_ON(last_zone != zone + pd->settings.size); first_sectors = last_zone - bio->bi_sector; diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 22ffd5dcb168..ca63104136e0 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -1143,7 +1143,7 @@ static struct bio *bio_clone_range(struct bio *bio_src, /* Find first affected segment... */ resid = offset; - __bio_for_each_segment(bv, bio_src, idx, 0) { + bio_for_each_segment(bv, bio_src, idx) { if (resid < bv->bv_len) break; resid -= bv->bv_len; diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index aeaea32bcfda..e9924898043a 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -63,8 +63,6 @@ config INTEL_IOATDMA depends on PCI && X86 select DMA_ENGINE select DCA - select ASYNC_TX_DISABLE_PQ_VAL_DMA - select ASYNC_TX_DISABLE_XOR_VAL_DMA help Enable support for the Intel(R) I/OAT DMA engine present in recent Intel Xeon chipsets. @@ -174,15 +172,7 @@ config TEGRA20_APB_DMA This DMA controller transfers data from memory to peripheral fifo or vice versa. It does not support memory to memory data transfer. - - -config SH_DMAE - tristate "Renesas SuperH DMAC support" - depends on (SUPERH && SH_DMA) || (ARM && ARCH_SHMOBILE) - depends on !SH_DMA_API - select DMA_ENGINE - help - Enable support for the Renesas SuperH DMA controllers. +source "drivers/dma/sh/Kconfig" config COH901318 bool "ST-Ericsson COH901318 DMA support" @@ -328,6 +318,10 @@ config DMA_ENGINE config DMA_VIRTUAL_CHANNELS tristate +config DMA_ACPI + def_bool y + depends on ACPI + config DMA_OF def_bool y depends on OF diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile index 488e3ff85b52..a2b0df591f95 100644 --- a/drivers/dma/Makefile +++ b/drivers/dma/Makefile @@ -3,6 +3,7 @@ ccflags-$(CONFIG_DMADEVICES_VDEBUG) += -DVERBOSE_DEBUG obj-$(CONFIG_DMA_ENGINE) += dmaengine.o obj-$(CONFIG_DMA_VIRTUAL_CHANNELS) += virt-dma.o +obj-$(CONFIG_DMA_ACPI) += acpi-dma.o obj-$(CONFIG_DMA_OF) += of-dma.o obj-$(CONFIG_NET_DMA) += iovlock.o @@ -18,7 +19,7 @@ obj-$(CONFIG_DW_DMAC) += dw_dmac.o obj-$(CONFIG_AT_HDMAC) += at_hdmac.o obj-$(CONFIG_MX3_IPU) += ipu/ obj-$(CONFIG_TXX9_DMAC) += txx9dmac.o -obj-$(CONFIG_SH_DMAE) += sh/ +obj-$(CONFIG_SH_DMAE_BASE) += sh/ obj-$(CONFIG_COH901318) += coh901318.o coh901318_lli.o obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/ obj-$(CONFIG_IMX_SDMA) += imx-sdma.o diff --git a/drivers/dma/acpi-dma.c b/drivers/dma/acpi-dma.c new file mode 100644 index 000000000000..ba6fc62e9651 --- /dev/null +++ b/drivers/dma/acpi-dma.c @@ -0,0 +1,279 @@ +/* + * ACPI helpers for DMA request / controller + * + * Based on of-dma.c + * + * Copyright (C) 2013, Intel Corporation + * Author: Andy Shevchenko + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include +#include +#include + +static LIST_HEAD(acpi_dma_list); +static DEFINE_MUTEX(acpi_dma_lock); + +/** + * acpi_dma_controller_register - Register a DMA controller to ACPI DMA helpers + * @dev: struct device of DMA controller + * @acpi_dma_xlate: translation function which converts a dma specifier + * into a dma_chan structure + * @data pointer to controller specific data to be used by + * translation function + * + * Returns 0 on success or appropriate errno value on error. + * + * Allocated memory should be freed with appropriate acpi_dma_controller_free() + * call. + */ +int acpi_dma_controller_register(struct device *dev, + struct dma_chan *(*acpi_dma_xlate) + (struct acpi_dma_spec *, struct acpi_dma *), + void *data) +{ + struct acpi_device *adev; + struct acpi_dma *adma; + + if (!dev || !acpi_dma_xlate) + return -EINVAL; + + /* Check if the device was enumerated by ACPI */ + if (!ACPI_HANDLE(dev)) + return -EINVAL; + + if (acpi_bus_get_device(ACPI_HANDLE(dev), &adev)) + return -EINVAL; + + adma = kzalloc(sizeof(*adma), GFP_KERNEL); + if (!adma) + return -ENOMEM; + + adma->dev = dev; + adma->acpi_dma_xlate = acpi_dma_xlate; + adma->data = data; + + /* Now queue acpi_dma controller structure in list */ + mutex_lock(&acpi_dma_lock); + list_add_tail(&adma->dma_controllers, &acpi_dma_list); + mutex_unlock(&acpi_dma_lock); + + return 0; +} +EXPORT_SYMBOL_GPL(acpi_dma_controller_register); + +/** + * acpi_dma_controller_free - Remove a DMA controller from ACPI DMA helpers list + * @dev: struct device of DMA controller + * + * Memory allocated by acpi_dma_controller_register() is freed here. + */ +int acpi_dma_controller_free(struct device *dev) +{ + struct acpi_dma *adma; + + if (!dev) + return -EINVAL; + + mutex_lock(&acpi_dma_lock); + + list_for_each_entry(adma, &acpi_dma_list, dma_controllers) + if (adma->dev == dev) { + list_del(&adma->dma_controllers); + mutex_unlock(&acpi_dma_lock); + kfree(adma); + return 0; + } + + mutex_unlock(&acpi_dma_lock); + return -ENODEV; +} +EXPORT_SYMBOL_GPL(acpi_dma_controller_free); + +static void devm_acpi_dma_release(struct device *dev, void *res) +{ + acpi_dma_controller_free(dev); +} + +/** + * devm_acpi_dma_controller_register - resource managed acpi_dma_controller_register() + * @dev: device that is registering this DMA controller + * @acpi_dma_xlate: translation function + * @data pointer to controller specific data + * + * Managed acpi_dma_controller_register(). DMA controller registered by this + * function are automatically freed on driver detach. See + * acpi_dma_controller_register() for more information. + */ +int devm_acpi_dma_controller_register(struct device *dev, + struct dma_chan *(*acpi_dma_xlate) + (struct acpi_dma_spec *, struct acpi_dma *), + void *data) +{ + void *res; + int ret; + + res = devres_alloc(devm_acpi_dma_release, 0, GFP_KERNEL); + if (!res) + return -ENOMEM; + + ret = acpi_dma_controller_register(dev, acpi_dma_xlate, data); + if (ret) { + devres_free(res); + return ret; + } + devres_add(dev, res); + return 0; +} +EXPORT_SYMBOL_GPL(devm_acpi_dma_controller_register); + +/** + * devm_acpi_dma_controller_free - resource managed acpi_dma_controller_free() + * + * Unregister a DMA controller registered with + * devm_acpi_dma_controller_register(). Normally this function will not need to + * be called and the resource management code will ensure that the resource is + * freed. + */ +void devm_acpi_dma_controller_free(struct device *dev) +{ + WARN_ON(devres_destroy(dev, devm_acpi_dma_release, NULL, NULL)); +} +EXPORT_SYMBOL_GPL(devm_acpi_dma_controller_free); + +struct acpi_dma_parser_data { + struct acpi_dma_spec dma_spec; + size_t index; + size_t n; +}; + +/** + * acpi_dma_parse_fixed_dma - Parse FixedDMA ACPI resources to a DMA specifier + * @res: struct acpi_resource to get FixedDMA resources from + * @data: pointer to a helper struct acpi_dma_parser_data + */ +static int acpi_dma_parse_fixed_dma(struct acpi_resource *res, void *data) +{ + struct acpi_dma_parser_data *pdata = data; + + if (res->type == ACPI_RESOURCE_TYPE_FIXED_DMA) { + struct acpi_resource_fixed_dma *dma = &res->data.fixed_dma; + + if (pdata->n++ == pdata->index) { + pdata->dma_spec.chan_id = dma->channels; + pdata->dma_spec.slave_id = dma->request_lines; + } + } + + /* Tell the ACPI core to skip this resource */ + return 1; +} + +/** + * acpi_dma_request_slave_chan_by_index - Get the DMA slave channel + * @dev: struct device to get DMA request from + * @index: index of FixedDMA descriptor for @dev + * + * Returns pointer to appropriate dma channel on success or NULL on error. + */ +struct dma_chan *acpi_dma_request_slave_chan_by_index(struct device *dev, + size_t index) +{ + struct acpi_dma_parser_data pdata; + struct acpi_dma_spec *dma_spec = &pdata.dma_spec; + struct list_head resource_list; + struct acpi_device *adev; + struct acpi_dma *adma; + struct dma_chan *chan = NULL; + + /* Check if the device was enumerated by ACPI */ + if (!dev || !ACPI_HANDLE(dev)) + return NULL; + + if (acpi_bus_get_device(ACPI_HANDLE(dev), &adev)) + return NULL; + + memset(&pdata, 0, sizeof(pdata)); + pdata.index = index; + + /* Initial values for the request line and channel */ + dma_spec->chan_id = -1; + dma_spec->slave_id = -1; + + INIT_LIST_HEAD(&resource_list); + acpi_dev_get_resources(adev, &resource_list, + acpi_dma_parse_fixed_dma, &pdata); + acpi_dev_free_resource_list(&resource_list); + + if (dma_spec->slave_id < 0 || dma_spec->chan_id < 0) + return NULL; + + mutex_lock(&acpi_dma_lock); + + list_for_each_entry(adma, &acpi_dma_list, dma_controllers) { + dma_spec->dev = adma->dev; + chan = adma->acpi_dma_xlate(dma_spec, adma); + if (chan) + break; + } + + mutex_unlock(&acpi_dma_lock); + return chan; +} +EXPORT_SYMBOL_GPL(acpi_dma_request_slave_chan_by_index); + +/** + * acpi_dma_request_slave_chan_by_name - Get the DMA slave channel + * @dev: struct device to get DMA request from + * @name: represents corresponding FixedDMA descriptor for @dev + * + * In order to support both Device Tree and ACPI in a single driver we + * translate the names "tx" and "rx" here based on the most common case where + * the first FixedDMA descriptor is TX and second is RX. + * + * Returns pointer to appropriate dma channel on success or NULL on error. + */ +struct dma_chan *acpi_dma_request_slave_chan_by_name(struct device *dev, + const char *name) +{ + size_t index; + + if (!strcmp(name, "tx")) + index = 0; + else if (!strcmp(name, "rx")) + index = 1; + else + return NULL; + + return acpi_dma_request_slave_chan_by_index(dev, index); +} +EXPORT_SYMBOL_GPL(acpi_dma_request_slave_chan_by_name); + +/** + * acpi_dma_simple_xlate - Simple ACPI DMA engine translation helper + * @dma_spec: pointer to ACPI DMA specifier + * @adma: pointer to ACPI DMA controller data + * + * A simple translation function for ACPI based devices. Passes &struct + * dma_spec to the DMA controller driver provided filter function. Returns + * pointer to the channel if found or %NULL otherwise. + */ +struct dma_chan *acpi_dma_simple_xlate(struct acpi_dma_spec *dma_spec, + struct acpi_dma *adma) +{ + struct acpi_dma_filter_info *info = adma->data; + + if (!info || !info->filter_fn) + return NULL; + + return dma_request_channel(info->dma_cap, info->filter_fn, dma_spec); +} +EXPORT_SYMBOL_GPL(acpi_dma_simple_xlate); diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c index 88cfc61329d2..e923cda930f9 100644 --- a/drivers/dma/at_hdmac.c +++ b/drivers/dma/at_hdmac.c @@ -24,6 +24,7 @@ #include #include #include +#include #include "at_hdmac_regs.h" #include "dmaengine.h" @@ -677,7 +678,7 @@ atc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, ctrlb |= ATC_DST_ADDR_MODE_FIXED | ATC_SRC_ADDR_MODE_INCR | ATC_FC_MEM2PER - | ATC_SIF(AT_DMA_MEM_IF) | ATC_DIF(AT_DMA_PER_IF); + | ATC_SIF(atchan->mem_if) | ATC_DIF(atchan->per_if); reg = sconfig->dst_addr; for_each_sg(sgl, sg, sg_len, i) { struct at_desc *desc; @@ -716,7 +717,7 @@ atc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, ctrlb |= ATC_DST_ADDR_MODE_INCR | ATC_SRC_ADDR_MODE_FIXED | ATC_FC_PER2MEM - | ATC_SIF(AT_DMA_PER_IF) | ATC_DIF(AT_DMA_MEM_IF); + | ATC_SIF(atchan->per_if) | ATC_DIF(atchan->mem_if); reg = sconfig->src_addr; for_each_sg(sgl, sg, sg_len, i) { @@ -822,8 +823,8 @@ atc_dma_cyclic_fill_desc(struct dma_chan *chan, struct at_desc *desc, desc->lli.ctrlb = ATC_DST_ADDR_MODE_FIXED | ATC_SRC_ADDR_MODE_INCR | ATC_FC_MEM2PER - | ATC_SIF(AT_DMA_MEM_IF) - | ATC_DIF(AT_DMA_PER_IF); + | ATC_SIF(atchan->mem_if) + | ATC_DIF(atchan->per_if); break; case DMA_DEV_TO_MEM: @@ -833,8 +834,8 @@ atc_dma_cyclic_fill_desc(struct dma_chan *chan, struct at_desc *desc, desc->lli.ctrlb = ATC_DST_ADDR_MODE_INCR | ATC_SRC_ADDR_MODE_FIXED | ATC_FC_PER2MEM - | ATC_SIF(AT_DMA_PER_IF) - | ATC_DIF(AT_DMA_MEM_IF); + | ATC_SIF(atchan->per_if) + | ATC_DIF(atchan->mem_if); break; default: @@ -1188,6 +1189,67 @@ static void atc_free_chan_resources(struct dma_chan *chan) dev_vdbg(chan2dev(chan), "free_chan_resources: done\n"); } +#ifdef CONFIG_OF +static bool at_dma_filter(struct dma_chan *chan, void *slave) +{ + struct at_dma_slave *atslave = slave; + + if (atslave->dma_dev == chan->device->dev) { + chan->private = atslave; + return true; + } else { + return false; + } +} + +static struct dma_chan *at_dma_xlate(struct of_phandle_args *dma_spec, + struct of_dma *of_dma) +{ + struct dma_chan *chan; + struct at_dma_chan *atchan; + struct at_dma_slave *atslave; + dma_cap_mask_t mask; + unsigned int per_id; + struct platform_device *dmac_pdev; + + if (dma_spec->args_count != 2) + return NULL; + + dmac_pdev = of_find_device_by_node(dma_spec->np); + + dma_cap_zero(mask); + dma_cap_set(DMA_SLAVE, mask); + + atslave = devm_kzalloc(&dmac_pdev->dev, sizeof(*atslave), GFP_KERNEL); + if (!atslave) + return NULL; + /* + * We can fill both SRC_PER and DST_PER, one of these fields will be + * ignored depending on DMA transfer direction. + */ + per_id = dma_spec->args[1]; + atslave->cfg = ATC_FIFOCFG_HALFFIFO | ATC_DST_H2SEL_HW + | ATC_SRC_H2SEL_HW | ATC_DST_PER(per_id) + | ATC_SRC_PER(per_id); + atslave->dma_dev = &dmac_pdev->dev; + + chan = dma_request_channel(mask, at_dma_filter, atslave); + if (!chan) + return NULL; + + atchan = to_at_dma_chan(chan); + atchan->per_if = dma_spec->args[0] & 0xff; + atchan->mem_if = (dma_spec->args[0] >> 16) & 0xff; + + return chan; +} +#else +static struct dma_chan *at_dma_xlate(struct of_phandle_args *dma_spec, + struct of_dma *of_dma) +{ + return NULL; +} +#endif /*-- Module Management -----------------------------------------------*/ @@ -1342,6 +1404,8 @@ static int __init at_dma_probe(struct platform_device *pdev) for (i = 0; i < plat_dat->nr_channels; i++) { struct at_dma_chan *atchan = &atdma->chan[i]; + atchan->mem_if = AT_DMA_MEM_IF; + atchan->per_if = AT_DMA_PER_IF; atchan->chan_common.device = &atdma->dma_common; dma_cookie_init(&atchan->chan_common); list_add_tail(&atchan->chan_common.device_node, @@ -1388,8 +1452,25 @@ static int __init at_dma_probe(struct platform_device *pdev) dma_async_device_register(&atdma->dma_common); + /* + * Do not return an error if the dmac node is not present in order to + * not break the existing way of requesting channel with + * dma_request_channel(). + */ + if (pdev->dev.of_node) { + err = of_dma_controller_register(pdev->dev.of_node, + at_dma_xlate, atdma); + if (err) { + dev_err(&pdev->dev, "could not register of_dma_controller\n"); + goto err_of_dma_controller_register; + } + } + return 0; +err_of_dma_controller_register: + dma_async_device_unregister(&atdma->dma_common); + dma_pool_destroy(atdma->dma_desc_pool); err_pool_create: platform_set_drvdata(pdev, NULL); free_irq(platform_get_irq(pdev, 0), atdma); @@ -1406,7 +1487,7 @@ err_kfree: return err; } -static int __exit at_dma_remove(struct platform_device *pdev) +static int at_dma_remove(struct platform_device *pdev) { struct at_dma *atdma = platform_get_drvdata(pdev); struct dma_chan *chan, *_chan; @@ -1564,7 +1645,7 @@ static const struct dev_pm_ops at_dma_dev_pm_ops = { }; static struct platform_driver at_dma_driver = { - .remove = __exit_p(at_dma_remove), + .remove = at_dma_remove, .shutdown = at_dma_shutdown, .id_table = atdma_devtypes, .driver = { diff --git a/drivers/dma/at_hdmac_regs.h b/drivers/dma/at_hdmac_regs.h index 0eb3c1388667..c604d26fd4d3 100644 --- a/drivers/dma/at_hdmac_regs.h +++ b/drivers/dma/at_hdmac_regs.h @@ -220,6 +220,8 @@ enum atc_status { * @device: parent device * @ch_regs: memory mapped register base * @mask: channel index in a mask + * @per_if: peripheral interface + * @mem_if: memory interface * @status: transmit status information from irq/prep* functions * to tasklet (use atomic operations) * @tasklet: bottom half to finish transaction work @@ -238,6 +240,8 @@ struct at_dma_chan { struct at_dma *device; void __iomem *ch_regs; u8 mask; + u8 per_if; + u8 mem_if; unsigned long status; struct tasklet_struct tasklet; u32 save_cfg; diff --git a/drivers/dma/coh901318.c b/drivers/dma/coh901318.c index 797940e532ff..3b23061cdb41 100644 --- a/drivers/dma/coh901318.c +++ b/drivers/dma/coh901318.c @@ -2748,7 +2748,7 @@ static int __init coh901318_probe(struct platform_device *pdev) return err; } -static int __exit coh901318_remove(struct platform_device *pdev) +static int coh901318_remove(struct platform_device *pdev) { struct coh901318_base *base = platform_get_drvdata(pdev); @@ -2760,7 +2760,7 @@ static int __exit coh901318_remove(struct platform_device *pdev) static struct platform_driver coh901318_driver = { - .remove = __exit_p(coh901318_remove), + .remove = coh901318_remove, .driver = { .name = "coh901318", }, diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c index b2728d6ba2fd..93f7992bee5c 100644 --- a/drivers/dma/dmaengine.c +++ b/drivers/dma/dmaengine.c @@ -62,6 +62,8 @@ #include #include #include +#include +#include #include static DEFINE_MUTEX(dma_list_mutex); @@ -174,7 +176,8 @@ static struct class dma_devclass = { #define dma_device_satisfies_mask(device, mask) \ __dma_device_satisfies_mask((device), &(mask)) static int -__dma_device_satisfies_mask(struct dma_device *device, dma_cap_mask_t *want) +__dma_device_satisfies_mask(struct dma_device *device, + const dma_cap_mask_t *want) { dma_cap_mask_t has; @@ -463,7 +466,8 @@ static void dma_channel_rebalance(void) } } -static struct dma_chan *private_candidate(dma_cap_mask_t *mask, struct dma_device *dev, +static struct dma_chan *private_candidate(const dma_cap_mask_t *mask, + struct dma_device *dev, dma_filter_fn fn, void *fn_param) { struct dma_chan *chan; @@ -505,7 +509,8 @@ static struct dma_chan *private_candidate(dma_cap_mask_t *mask, struct dma_devic * @fn: optional callback to disposition available channels * @fn_param: opaque parameter to pass to dma_filter_fn */ -struct dma_chan *__dma_request_channel(dma_cap_mask_t *mask, dma_filter_fn fn, void *fn_param) +struct dma_chan *__dma_request_channel(const dma_cap_mask_t *mask, + dma_filter_fn fn, void *fn_param) { struct dma_device *device, *_d; struct dma_chan *chan = NULL; @@ -555,12 +560,16 @@ EXPORT_SYMBOL_GPL(__dma_request_channel); * @dev: pointer to client device structure * @name: slave channel name */ -struct dma_chan *dma_request_slave_channel(struct device *dev, char *name) +struct dma_chan *dma_request_slave_channel(struct device *dev, const char *name) { /* If device-tree is present get slave info from here */ if (dev->of_node) return of_dma_request_slave_channel(dev->of_node, name); + /* If device was enumerated by ACPI get slave info from here */ + if (ACPI_HANDLE(dev)) + return acpi_dma_request_slave_chan_by_name(dev, name); + return NULL; } EXPORT_SYMBOL_GPL(dma_request_slave_channel); diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c index a2c8904b63ea..d8ce4ecfef18 100644 --- a/drivers/dma/dmatest.c +++ b/drivers/dma/dmatest.c @@ -2,6 +2,7 @@ * DMA Engine test module * * Copyright (C) 2007 Atmel Corporation + * Copyright (C) 2013 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -18,6 +19,10 @@ #include #include #include +#include +#include +#include +#include static unsigned int test_buf_size = 16384; module_param(test_buf_size, uint, S_IRUGO); @@ -61,6 +66,9 @@ module_param(timeout, uint, S_IRUGO); MODULE_PARM_DESC(timeout, "Transfer Timeout in msec (default: 3000), " "Pass -1 for infinite timeout"); +/* Maximum amount of mismatched bytes in buffer to print */ +#define MAX_ERROR_COUNT 32 + /* * Initialization patterns. All bytes in the source buffer has bit 7 * set, all bytes in the destination buffer has bit 7 cleared. @@ -78,13 +86,65 @@ MODULE_PARM_DESC(timeout, "Transfer Timeout in msec (default: 3000), " #define PATTERN_OVERWRITE 0x20 #define PATTERN_COUNT_MASK 0x1f +enum dmatest_error_type { + DMATEST_ET_OK, + DMATEST_ET_MAP_SRC, + DMATEST_ET_MAP_DST, + DMATEST_ET_PREP, + DMATEST_ET_SUBMIT, + DMATEST_ET_TIMEOUT, + DMATEST_ET_DMA_ERROR, + DMATEST_ET_DMA_IN_PROGRESS, + DMATEST_ET_VERIFY, + DMATEST_ET_VERIFY_BUF, +}; + +struct dmatest_verify_buffer { + unsigned int index; + u8 expected; + u8 actual; +}; + +struct dmatest_verify_result { + unsigned int error_count; + struct dmatest_verify_buffer data[MAX_ERROR_COUNT]; + u8 pattern; + bool is_srcbuf; +}; + +struct dmatest_thread_result { + struct list_head node; + unsigned int n; + unsigned int src_off; + unsigned int dst_off; + unsigned int len; + enum dmatest_error_type type; + union { + unsigned long data; + dma_cookie_t cookie; + enum dma_status status; + int error; + struct dmatest_verify_result *vr; + }; +}; + +struct dmatest_result { + struct list_head node; + char *name; + struct list_head results; +}; + +struct dmatest_info; + struct dmatest_thread { struct list_head node; + struct dmatest_info *info; struct task_struct *task; struct dma_chan *chan; u8 **srcs; u8 **dsts; enum dma_transaction_type type; + bool done; }; struct dmatest_chan { @@ -93,25 +153,69 @@ struct dmatest_chan { struct list_head threads; }; -/* - * These are protected by dma_list_mutex since they're only used by - * the DMA filter function callback +/** + * struct dmatest_params - test parameters. + * @buf_size: size of the memcpy test buffer + * @channel: bus ID of the channel to test + * @device: bus ID of the DMA Engine to test + * @threads_per_chan: number of threads to start per channel + * @max_channels: maximum number of channels to use + * @iterations: iterations before stopping test + * @xor_sources: number of xor source buffers + * @pq_sources: number of p+q source buffers + * @timeout: transfer timeout in msec, -1 for infinite timeout */ -static LIST_HEAD(dmatest_channels); -static unsigned int nr_channels; +struct dmatest_params { + unsigned int buf_size; + char channel[20]; + char device[20]; + unsigned int threads_per_chan; + unsigned int max_channels; + unsigned int iterations; + unsigned int xor_sources; + unsigned int pq_sources; + int timeout; +}; -static bool dmatest_match_channel(struct dma_chan *chan) +/** + * struct dmatest_info - test information. + * @params: test parameters + * @lock: access protection to the fields of this structure + */ +struct dmatest_info { + /* Test parameters */ + struct dmatest_params params; + + /* Internal state */ + struct list_head channels; + unsigned int nr_channels; + struct mutex lock; + + /* debugfs related stuff */ + struct dentry *root; + struct dmatest_params dbgfs_params; + + /* Test results */ + struct list_head results; + struct mutex results_lock; +}; + +static struct dmatest_info test_info; + +static bool dmatest_match_channel(struct dmatest_params *params, + struct dma_chan *chan) { - if (test_channel[0] == '\0') + if (params->channel[0] == '\0') return true; - return strcmp(dma_chan_name(chan), test_channel) == 0; + return strcmp(dma_chan_name(chan), params->channel) == 0; } -static bool dmatest_match_device(struct dma_device *device) +static bool dmatest_match_device(struct dmatest_params *params, + struct dma_device *device) { - if (test_device[0] == '\0') + if (params->device[0] == '\0') return true; - return strcmp(dev_name(device->dev), test_device) == 0; + return strcmp(dev_name(device->dev), params->device) == 0; } static unsigned long dmatest_random(void) @@ -122,7 +226,8 @@ static unsigned long dmatest_random(void) return buf; } -static void dmatest_init_srcs(u8 **bufs, unsigned int start, unsigned int len) +static void dmatest_init_srcs(u8 **bufs, unsigned int start, unsigned int len, + unsigned int buf_size) { unsigned int i; u8 *buf; @@ -133,13 +238,14 @@ static void dmatest_init_srcs(u8 **bufs, unsigned int start, unsigned int len) for ( ; i < start + len; i++) buf[i] = PATTERN_SRC | PATTERN_COPY | (~i & PATTERN_COUNT_MASK); - for ( ; i < test_buf_size; i++) + for ( ; i < buf_size; i++) buf[i] = PATTERN_SRC | (~i & PATTERN_COUNT_MASK); buf++; } } -static void dmatest_init_dsts(u8 **bufs, unsigned int start, unsigned int len) +static void dmatest_init_dsts(u8 **bufs, unsigned int start, unsigned int len, + unsigned int buf_size) { unsigned int i; u8 *buf; @@ -150,40 +256,14 @@ static void dmatest_init_dsts(u8 **bufs, unsigned int start, unsigned int len) for ( ; i < start + len; i++) buf[i] = PATTERN_DST | PATTERN_OVERWRITE | (~i & PATTERN_COUNT_MASK); - for ( ; i < test_buf_size; i++) + for ( ; i < buf_size; i++) buf[i] = PATTERN_DST | (~i & PATTERN_COUNT_MASK); } } -static void dmatest_mismatch(u8 actual, u8 pattern, unsigned int index, - unsigned int counter, bool is_srcbuf) -{ - u8 diff = actual ^ pattern; - u8 expected = pattern | (~counter & PATTERN_COUNT_MASK); - const char *thread_name = current->comm; - - if (is_srcbuf) - pr_warning("%s: srcbuf[0x%x] overwritten!" - " Expected %02x, got %02x\n", - thread_name, index, expected, actual); - else if ((pattern & PATTERN_COPY) - && (diff & (PATTERN_COPY | PATTERN_OVERWRITE))) - pr_warning("%s: dstbuf[0x%x] not copied!" - " Expected %02x, got %02x\n", - thread_name, index, expected, actual); - else if (diff & PATTERN_SRC) - pr_warning("%s: dstbuf[0x%x] was copied!" - " Expected %02x, got %02x\n", - thread_name, index, expected, actual); - else - pr_warning("%s: dstbuf[0x%x] mismatch!" - " Expected %02x, got %02x\n", - thread_name, index, expected, actual); -} - -static unsigned int dmatest_verify(u8 **bufs, unsigned int start, - unsigned int end, unsigned int counter, u8 pattern, - bool is_srcbuf) +static unsigned int dmatest_verify(struct dmatest_verify_result *vr, u8 **bufs, + unsigned int start, unsigned int end, unsigned int counter, + u8 pattern, bool is_srcbuf) { unsigned int i; unsigned int error_count = 0; @@ -191,6 +271,7 @@ static unsigned int dmatest_verify(u8 **bufs, unsigned int start, u8 expected; u8 *buf; unsigned int counter_orig = counter; + struct dmatest_verify_buffer *vb; for (; (buf = *bufs); bufs++) { counter = counter_orig; @@ -198,18 +279,21 @@ static unsigned int dmatest_verify(u8 **bufs, unsigned int start, actual = buf[i]; expected = pattern | (~counter & PATTERN_COUNT_MASK); if (actual != expected) { - if (error_count < 32) - dmatest_mismatch(actual, pattern, i, - counter, is_srcbuf); + if (error_count < MAX_ERROR_COUNT && vr) { + vb = &vr->data[error_count]; + vb->index = i; + vb->expected = expected; + vb->actual = actual; + } error_count++; } counter++; } } - if (error_count > 32) + if (error_count > MAX_ERROR_COUNT) pr_warning("%s: %u errors suppressed\n", - current->comm, error_count - 32); + current->comm, error_count - MAX_ERROR_COUNT); return error_count; } @@ -249,6 +333,170 @@ static unsigned int min_odd(unsigned int x, unsigned int y) return val % 2 ? val : val - 1; } +static char *verify_result_get_one(struct dmatest_verify_result *vr, + unsigned int i) +{ + struct dmatest_verify_buffer *vb = &vr->data[i]; + u8 diff = vb->actual ^ vr->pattern; + static char buf[512]; + char *msg; + + if (vr->is_srcbuf) + msg = "srcbuf overwritten!"; + else if ((vr->pattern & PATTERN_COPY) + && (diff & (PATTERN_COPY | PATTERN_OVERWRITE))) + msg = "dstbuf not copied!"; + else if (diff & PATTERN_SRC) + msg = "dstbuf was copied!"; + else + msg = "dstbuf mismatch!"; + + snprintf(buf, sizeof(buf) - 1, "%s [0x%x] Expected %02x, got %02x", msg, + vb->index, vb->expected, vb->actual); + + return buf; +} + +static char *thread_result_get(const char *name, + struct dmatest_thread_result *tr) +{ + static const char * const messages[] = { + [DMATEST_ET_OK] = "No errors", + [DMATEST_ET_MAP_SRC] = "src mapping error", + [DMATEST_ET_MAP_DST] = "dst mapping error", + [DMATEST_ET_PREP] = "prep error", + [DMATEST_ET_SUBMIT] = "submit error", + [DMATEST_ET_TIMEOUT] = "test timed out", + [DMATEST_ET_DMA_ERROR] = + "got completion callback (DMA_ERROR)", + [DMATEST_ET_DMA_IN_PROGRESS] = + "got completion callback (DMA_IN_PROGRESS)", + [DMATEST_ET_VERIFY] = "errors", + [DMATEST_ET_VERIFY_BUF] = "verify errors", + }; + static char buf[512]; + + snprintf(buf, sizeof(buf) - 1, + "%s: #%u: %s with src_off=0x%x ""dst_off=0x%x len=0x%x (%lu)", + name, tr->n, messages[tr->type], tr->src_off, tr->dst_off, + tr->len, tr->data); + + return buf; +} + +static int thread_result_add(struct dmatest_info *info, + struct dmatest_result *r, enum dmatest_error_type type, + unsigned int n, unsigned int src_off, unsigned int dst_off, + unsigned int len, unsigned long data) +{ + struct dmatest_thread_result *tr; + + tr = kzalloc(sizeof(*tr), GFP_KERNEL); + if (!tr) + return -ENOMEM; + + tr->type = type; + tr->n = n; + tr->src_off = src_off; + tr->dst_off = dst_off; + tr->len = len; + tr->data = data; + + mutex_lock(&info->results_lock); + list_add_tail(&tr->node, &r->results); + mutex_unlock(&info->results_lock); + + pr_warn("%s\n", thread_result_get(r->name, tr)); + return 0; +} + +static unsigned int verify_result_add(struct dmatest_info *info, + struct dmatest_result *r, unsigned int n, + unsigned int src_off, unsigned int dst_off, unsigned int len, + u8 **bufs, int whence, unsigned int counter, u8 pattern, + bool is_srcbuf) +{ + struct dmatest_verify_result *vr; + unsigned int error_count; + unsigned int buf_off = is_srcbuf ? src_off : dst_off; + unsigned int start, end; + + if (whence < 0) { + start = 0; + end = buf_off; + } else if (whence > 0) { + start = buf_off + len; + end = info->params.buf_size; + } else { + start = buf_off; + end = buf_off + len; + } + + vr = kmalloc(sizeof(*vr), GFP_KERNEL); + if (!vr) { + pr_warn("dmatest: No memory to store verify result\n"); + return dmatest_verify(NULL, bufs, start, end, counter, pattern, + is_srcbuf); + } + + vr->pattern = pattern; + vr->is_srcbuf = is_srcbuf; + + error_count = dmatest_verify(vr, bufs, start, end, counter, pattern, + is_srcbuf); + if (error_count) { + vr->error_count = error_count; + thread_result_add(info, r, DMATEST_ET_VERIFY_BUF, n, src_off, + dst_off, len, (unsigned long)vr); + return error_count; + } + + kfree(vr); + return 0; +} + +static void result_free(struct dmatest_info *info, const char *name) +{ + struct dmatest_result *r, *_r; + + mutex_lock(&info->results_lock); + list_for_each_entry_safe(r, _r, &info->results, node) { + struct dmatest_thread_result *tr, *_tr; + + if (name && strcmp(r->name, name)) + continue; + + list_for_each_entry_safe(tr, _tr, &r->results, node) { + if (tr->type == DMATEST_ET_VERIFY_BUF) + kfree(tr->vr); + list_del(&tr->node); + kfree(tr); + } + + kfree(r->name); + list_del(&r->node); + kfree(r); + } + + mutex_unlock(&info->results_lock); +} + +static struct dmatest_result *result_init(struct dmatest_info *info, + const char *name) +{ + struct dmatest_result *r; + + r = kzalloc(sizeof(*r), GFP_KERNEL); + if (r) { + r->name = kstrdup(name, GFP_KERNEL); + INIT_LIST_HEAD(&r->results); + mutex_lock(&info->results_lock); + list_add_tail(&r->node, &info->results); + mutex_unlock(&info->results_lock); + } + return r; +} + /* * This function repeatedly tests DMA transfers of various lengths and * offsets for a given operation type until it is told to exit by @@ -268,6 +516,8 @@ static int dmatest_func(void *data) DECLARE_WAIT_QUEUE_HEAD_ONSTACK(done_wait); struct dmatest_thread *thread = data; struct dmatest_done done = { .wait = &done_wait }; + struct dmatest_info *info; + struct dmatest_params *params; struct dma_chan *chan; struct dma_device *dev; const char *thread_name; @@ -278,11 +528,12 @@ static int dmatest_func(void *data) dma_cookie_t cookie; enum dma_status status; enum dma_ctrl_flags flags; - u8 pq_coefs[pq_sources + 1]; + u8 *pq_coefs = NULL; int ret; int src_cnt; int dst_cnt; int i; + struct dmatest_result *result; thread_name = current->comm; set_freezable(); @@ -290,28 +541,39 @@ static int dmatest_func(void *data) ret = -ENOMEM; smp_rmb(); + info = thread->info; + params = &info->params; chan = thread->chan; dev = chan->device; if (thread->type == DMA_MEMCPY) src_cnt = dst_cnt = 1; else if (thread->type == DMA_XOR) { /* force odd to ensure dst = src */ - src_cnt = min_odd(xor_sources | 1, dev->max_xor); + src_cnt = min_odd(params->xor_sources | 1, dev->max_xor); dst_cnt = 1; } else if (thread->type == DMA_PQ) { /* force odd to ensure dst = src */ - src_cnt = min_odd(pq_sources | 1, dma_maxpq(dev, 0)); + src_cnt = min_odd(params->pq_sources | 1, dma_maxpq(dev, 0)); dst_cnt = 2; + + pq_coefs = kmalloc(params->pq_sources+1, GFP_KERNEL); + if (!pq_coefs) + goto err_thread_type; + for (i = 0; i < src_cnt; i++) pq_coefs[i] = 1; } else + goto err_thread_type; + + result = result_init(info, thread_name); + if (!result) goto err_srcs; thread->srcs = kcalloc(src_cnt+1, sizeof(u8 *), GFP_KERNEL); if (!thread->srcs) goto err_srcs; for (i = 0; i < src_cnt; i++) { - thread->srcs[i] = kmalloc(test_buf_size, GFP_KERNEL); + thread->srcs[i] = kmalloc(params->buf_size, GFP_KERNEL); if (!thread->srcs[i]) goto err_srcbuf; } @@ -321,7 +583,7 @@ static int dmatest_func(void *data) if (!thread->dsts) goto err_dsts; for (i = 0; i < dst_cnt; i++) { - thread->dsts[i] = kmalloc(test_buf_size, GFP_KERNEL); + thread->dsts[i] = kmalloc(params->buf_size, GFP_KERNEL); if (!thread->dsts[i]) goto err_dstbuf; } @@ -337,7 +599,7 @@ static int dmatest_func(void *data) | DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SRC_UNMAP_SINGLE; while (!kthread_should_stop() - && !(iterations && total_tests >= iterations)) { + && !(params->iterations && total_tests >= params->iterations)) { struct dma_async_tx_descriptor *tx = NULL; dma_addr_t dma_srcs[src_cnt]; dma_addr_t dma_dsts[dst_cnt]; @@ -353,24 +615,24 @@ static int dmatest_func(void *data) else if (thread->type == DMA_PQ) align = dev->pq_align; - if (1 << align > test_buf_size) { + if (1 << align > params->buf_size) { pr_err("%u-byte buffer too small for %d-byte alignment\n", - test_buf_size, 1 << align); + params->buf_size, 1 << align); break; } - len = dmatest_random() % test_buf_size + 1; + len = dmatest_random() % params->buf_size + 1; len = (len >> align) << align; if (!len) len = 1 << align; - src_off = dmatest_random() % (test_buf_size - len + 1); - dst_off = dmatest_random() % (test_buf_size - len + 1); + src_off = dmatest_random() % (params->buf_size - len + 1); + dst_off = dmatest_random() % (params->buf_size - len + 1); src_off = (src_off >> align) << align; dst_off = (dst_off >> align) << align; - dmatest_init_srcs(thread->srcs, src_off, len); - dmatest_init_dsts(thread->dsts, dst_off, len); + dmatest_init_srcs(thread->srcs, src_off, len, params->buf_size); + dmatest_init_dsts(thread->dsts, dst_off, len, params->buf_size); for (i = 0; i < src_cnt; i++) { u8 *buf = thread->srcs[i] + src_off; @@ -380,10 +642,10 @@ static int dmatest_func(void *data) ret = dma_mapping_error(dev->dev, dma_srcs[i]); if (ret) { unmap_src(dev->dev, dma_srcs, len, i); - pr_warn("%s: #%u: mapping error %d with " - "src_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, ret, - src_off, len); + thread_result_add(info, result, + DMATEST_ET_MAP_SRC, + total_tests, src_off, dst_off, + len, ret); failed_tests++; continue; } @@ -391,16 +653,17 @@ static int dmatest_func(void *data) /* map with DMA_BIDIRECTIONAL to force writeback/invalidate */ for (i = 0; i < dst_cnt; i++) { dma_dsts[i] = dma_map_single(dev->dev, thread->dsts[i], - test_buf_size, + params->buf_size, DMA_BIDIRECTIONAL); ret = dma_mapping_error(dev->dev, dma_dsts[i]); if (ret) { unmap_src(dev->dev, dma_srcs, len, src_cnt); - unmap_dst(dev->dev, dma_dsts, test_buf_size, i); - pr_warn("%s: #%u: mapping error %d with " - "dst_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, ret, - dst_off, test_buf_size); + unmap_dst(dev->dev, dma_dsts, params->buf_size, + i); + thread_result_add(info, result, + DMATEST_ET_MAP_DST, + total_tests, src_off, dst_off, + len, ret); failed_tests++; continue; } @@ -428,11 +691,11 @@ static int dmatest_func(void *data) if (!tx) { unmap_src(dev->dev, dma_srcs, len, src_cnt); - unmap_dst(dev->dev, dma_dsts, test_buf_size, dst_cnt); - pr_warning("%s: #%u: prep error with src_off=0x%x " - "dst_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, - src_off, dst_off, len); + unmap_dst(dev->dev, dma_dsts, params->buf_size, + dst_cnt); + thread_result_add(info, result, DMATEST_ET_PREP, + total_tests, src_off, dst_off, + len, 0); msleep(100); failed_tests++; continue; @@ -444,18 +707,18 @@ static int dmatest_func(void *data) cookie = tx->tx_submit(tx); if (dma_submit_error(cookie)) { - pr_warning("%s: #%u: submit error %d with src_off=0x%x " - "dst_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, cookie, - src_off, dst_off, len); + thread_result_add(info, result, DMATEST_ET_SUBMIT, + total_tests, src_off, dst_off, + len, cookie); msleep(100); failed_tests++; continue; } dma_async_issue_pending(chan); - wait_event_freezable_timeout(done_wait, done.done, - msecs_to_jiffies(timeout)); + wait_event_freezable_timeout(done_wait, + done.done || kthread_should_stop(), + msecs_to_jiffies(params->timeout)); status = dma_async_is_tx_complete(chan, cookie, NULL, NULL); @@ -468,56 +731,57 @@ static int dmatest_func(void *data) * free it this time?" dancing. For now, just * leave it dangling. */ - pr_warning("%s: #%u: test timed out\n", - thread_name, total_tests - 1); + thread_result_add(info, result, DMATEST_ET_TIMEOUT, + total_tests, src_off, dst_off, + len, 0); failed_tests++; continue; } else if (status != DMA_SUCCESS) { - pr_warning("%s: #%u: got completion callback," - " but status is \'%s\'\n", - thread_name, total_tests - 1, - status == DMA_ERROR ? "error" : "in progress"); + enum dmatest_error_type type = (status == DMA_ERROR) ? + DMATEST_ET_DMA_ERROR : DMATEST_ET_DMA_IN_PROGRESS; + thread_result_add(info, result, type, + total_tests, src_off, dst_off, + len, status); failed_tests++; continue; } /* Unmap by myself (see DMA_COMPL_SKIP_DEST_UNMAP above) */ - unmap_dst(dev->dev, dma_dsts, test_buf_size, dst_cnt); + unmap_dst(dev->dev, dma_dsts, params->buf_size, dst_cnt); error_count = 0; pr_debug("%s: verifying source buffer...\n", thread_name); - error_count += dmatest_verify(thread->srcs, 0, src_off, + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->srcs, -1, 0, PATTERN_SRC, true); - error_count += dmatest_verify(thread->srcs, src_off, - src_off + len, src_off, - PATTERN_SRC | PATTERN_COPY, true); - error_count += dmatest_verify(thread->srcs, src_off + len, - test_buf_size, src_off + len, - PATTERN_SRC, true); - - pr_debug("%s: verifying dest buffer...\n", - thread->task->comm); - error_count += dmatest_verify(thread->dsts, 0, dst_off, + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->srcs, 0, + src_off, PATTERN_SRC | PATTERN_COPY, true); + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->srcs, 1, + src_off + len, PATTERN_SRC, true); + + pr_debug("%s: verifying dest buffer...\n", thread_name); + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->dsts, -1, 0, PATTERN_DST, false); - error_count += dmatest_verify(thread->dsts, dst_off, - dst_off + len, src_off, - PATTERN_SRC | PATTERN_COPY, false); - error_count += dmatest_verify(thread->dsts, dst_off + len, - test_buf_size, dst_off + len, - PATTERN_DST, false); + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->dsts, 0, + src_off, PATTERN_SRC | PATTERN_COPY, false); + error_count += verify_result_add(info, result, total_tests, + src_off, dst_off, len, thread->dsts, 1, + dst_off + len, PATTERN_DST, false); if (error_count) { - pr_warning("%s: #%u: %u errors with " - "src_off=0x%x dst_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, error_count, - src_off, dst_off, len); + thread_result_add(info, result, DMATEST_ET_VERIFY, + total_tests, src_off, dst_off, + len, error_count); failed_tests++; } else { - pr_debug("%s: #%u: No errors with " - "src_off=0x%x dst_off=0x%x len=0x%x\n", - thread_name, total_tests - 1, - src_off, dst_off, len); + thread_result_add(info, result, DMATEST_ET_OK, + total_tests, src_off, dst_off, + len, 0); } } @@ -532,6 +796,8 @@ err_dsts: err_srcbuf: kfree(thread->srcs); err_srcs: + kfree(pq_coefs); +err_thread_type: pr_notice("%s: terminating after %u tests, %u failures (status %d)\n", thread_name, total_tests, failed_tests, ret); @@ -539,7 +805,9 @@ err_srcs: if (ret) dmaengine_terminate_all(chan); - if (iterations > 0) + thread->done = true; + + if (params->iterations > 0) while (!kthread_should_stop()) { DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wait_dmatest_exit); interruptible_sleep_on(&wait_dmatest_exit); @@ -568,8 +836,10 @@ static void dmatest_cleanup_channel(struct dmatest_chan *dtc) kfree(dtc); } -static int dmatest_add_threads(struct dmatest_chan *dtc, enum dma_transaction_type type) +static int dmatest_add_threads(struct dmatest_info *info, + struct dmatest_chan *dtc, enum dma_transaction_type type) { + struct dmatest_params *params = &info->params; struct dmatest_thread *thread; struct dma_chan *chan = dtc->chan; char *op; @@ -584,7 +854,7 @@ static int dmatest_add_threads(struct dmatest_chan *dtc, enum dma_transaction_ty else return -EINVAL; - for (i = 0; i < threads_per_chan; i++) { + for (i = 0; i < params->threads_per_chan; i++) { thread = kzalloc(sizeof(struct dmatest_thread), GFP_KERNEL); if (!thread) { pr_warning("dmatest: No memory for %s-%s%u\n", @@ -592,6 +862,7 @@ static int dmatest_add_threads(struct dmatest_chan *dtc, enum dma_transaction_ty break; } + thread->info = info; thread->chan = dtc->chan; thread->type = type; smp_wmb(); @@ -612,7 +883,8 @@ static int dmatest_add_threads(struct dmatest_chan *dtc, enum dma_transaction_ty return i; } -static int dmatest_add_channel(struct dma_chan *chan) +static int dmatest_add_channel(struct dmatest_info *info, + struct dma_chan *chan) { struct dmatest_chan *dtc; struct dma_device *dma_dev = chan->device; @@ -629,75 +901,418 @@ static int dmatest_add_channel(struct dma_chan *chan) INIT_LIST_HEAD(&dtc->threads); if (dma_has_cap(DMA_MEMCPY, dma_dev->cap_mask)) { - cnt = dmatest_add_threads(dtc, DMA_MEMCPY); + cnt = dmatest_add_threads(info, dtc, DMA_MEMCPY); thread_count += cnt > 0 ? cnt : 0; } if (dma_has_cap(DMA_XOR, dma_dev->cap_mask)) { - cnt = dmatest_add_threads(dtc, DMA_XOR); + cnt = dmatest_add_threads(info, dtc, DMA_XOR); thread_count += cnt > 0 ? cnt : 0; } if (dma_has_cap(DMA_PQ, dma_dev->cap_mask)) { - cnt = dmatest_add_threads(dtc, DMA_PQ); + cnt = dmatest_add_threads(info, dtc, DMA_PQ); thread_count += cnt > 0 ? cnt : 0; } pr_info("dmatest: Started %u threads using %s\n", thread_count, dma_chan_name(chan)); - list_add_tail(&dtc->node, &dmatest_channels); - nr_channels++; + list_add_tail(&dtc->node, &info->channels); + info->nr_channels++; return 0; } static bool filter(struct dma_chan *chan, void *param) { - if (!dmatest_match_channel(chan) || !dmatest_match_device(chan->device)) + struct dmatest_params *params = param; + + if (!dmatest_match_channel(params, chan) || + !dmatest_match_device(params, chan->device)) return false; else return true; } -static int __init dmatest_init(void) +static int __run_threaded_test(struct dmatest_info *info) { dma_cap_mask_t mask; struct dma_chan *chan; + struct dmatest_params *params = &info->params; int err = 0; dma_cap_zero(mask); dma_cap_set(DMA_MEMCPY, mask); for (;;) { - chan = dma_request_channel(mask, filter, NULL); + chan = dma_request_channel(mask, filter, params); if (chan) { - err = dmatest_add_channel(chan); + err = dmatest_add_channel(info, chan); if (err) { dma_release_channel(chan); break; /* add_channel failed, punt */ } } else break; /* no more channels available */ - if (max_channels && nr_channels >= max_channels) + if (params->max_channels && + info->nr_channels >= params->max_channels) break; /* we have all we need */ } - return err; } -/* when compiled-in wait for drivers to load first */ -late_initcall(dmatest_init); -static void __exit dmatest_exit(void) +#ifndef MODULE +static int run_threaded_test(struct dmatest_info *info) +{ + int ret; + + mutex_lock(&info->lock); + ret = __run_threaded_test(info); + mutex_unlock(&info->lock); + return ret; +} +#endif + +static void __stop_threaded_test(struct dmatest_info *info) { struct dmatest_chan *dtc, *_dtc; struct dma_chan *chan; - list_for_each_entry_safe(dtc, _dtc, &dmatest_channels, node) { + list_for_each_entry_safe(dtc, _dtc, &info->channels, node) { list_del(&dtc->node); chan = dtc->chan; dmatest_cleanup_channel(dtc); - pr_debug("dmatest: dropped channel %s\n", - dma_chan_name(chan)); + pr_debug("dmatest: dropped channel %s\n", dma_chan_name(chan)); dma_release_channel(chan); } + + info->nr_channels = 0; +} + +static void stop_threaded_test(struct dmatest_info *info) +{ + mutex_lock(&info->lock); + __stop_threaded_test(info); + mutex_unlock(&info->lock); +} + +static int __restart_threaded_test(struct dmatest_info *info, bool run) +{ + struct dmatest_params *params = &info->params; + int ret; + + /* Stop any running test first */ + __stop_threaded_test(info); + + if (run == false) + return 0; + + /* Clear results from previous run */ + result_free(info, NULL); + + /* Copy test parameters */ + memcpy(params, &info->dbgfs_params, sizeof(*params)); + + /* Run test with new parameters */ + ret = __run_threaded_test(info); + if (ret) { + __stop_threaded_test(info); + pr_err("dmatest: Can't run test\n"); + } + + return ret; +} + +static ssize_t dtf_write_string(void *to, size_t available, loff_t *ppos, + const void __user *from, size_t count) +{ + char tmp[20]; + ssize_t len; + + len = simple_write_to_buffer(tmp, sizeof(tmp) - 1, ppos, from, count); + if (len >= 0) { + tmp[len] = '\0'; + strlcpy(to, strim(tmp), available); + } + + return len; +} + +static ssize_t dtf_read_channel(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + return simple_read_from_buffer(buf, count, ppos, + info->dbgfs_params.channel, + strlen(info->dbgfs_params.channel)); +} + +static ssize_t dtf_write_channel(struct file *file, const char __user *buf, + size_t size, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + return dtf_write_string(info->dbgfs_params.channel, + sizeof(info->dbgfs_params.channel), + ppos, buf, size); +} + +static const struct file_operations dtf_channel_fops = { + .read = dtf_read_channel, + .write = dtf_write_channel, + .open = simple_open, + .llseek = default_llseek, +}; + +static ssize_t dtf_read_device(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + return simple_read_from_buffer(buf, count, ppos, + info->dbgfs_params.device, + strlen(info->dbgfs_params.device)); +} + +static ssize_t dtf_write_device(struct file *file, const char __user *buf, + size_t size, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + return dtf_write_string(info->dbgfs_params.device, + sizeof(info->dbgfs_params.device), + ppos, buf, size); +} + +static const struct file_operations dtf_device_fops = { + .read = dtf_read_device, + .write = dtf_write_device, + .open = simple_open, + .llseek = default_llseek, +}; + +static ssize_t dtf_read_run(struct file *file, char __user *user_buf, + size_t count, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + char buf[3]; + struct dmatest_chan *dtc; + bool alive = false; + + mutex_lock(&info->lock); + list_for_each_entry(dtc, &info->channels, node) { + struct dmatest_thread *thread; + + list_for_each_entry(thread, &dtc->threads, node) { + if (!thread->done) { + alive = true; + break; + } + } + } + + if (alive) { + buf[0] = 'Y'; + } else { + __stop_threaded_test(info); + buf[0] = 'N'; + } + + mutex_unlock(&info->lock); + buf[1] = '\n'; + buf[2] = 0x00; + return simple_read_from_buffer(user_buf, count, ppos, buf, 2); +} + +static ssize_t dtf_write_run(struct file *file, const char __user *user_buf, + size_t count, loff_t *ppos) +{ + struct dmatest_info *info = file->private_data; + char buf[16]; + bool bv; + int ret = 0; + + if (copy_from_user(buf, user_buf, min(count, (sizeof(buf) - 1)))) + return -EFAULT; + + if (strtobool(buf, &bv) == 0) { + mutex_lock(&info->lock); + ret = __restart_threaded_test(info, bv); + mutex_unlock(&info->lock); + } + + return ret ? ret : count; +} + +static const struct file_operations dtf_run_fops = { + .read = dtf_read_run, + .write = dtf_write_run, + .open = simple_open, + .llseek = default_llseek, +}; + +static int dtf_results_show(struct seq_file *sf, void *data) +{ + struct dmatest_info *info = sf->private; + struct dmatest_result *result; + struct dmatest_thread_result *tr; + unsigned int i; + + mutex_lock(&info->results_lock); + list_for_each_entry(result, &info->results, node) { + list_for_each_entry(tr, &result->results, node) { + seq_printf(sf, "%s\n", + thread_result_get(result->name, tr)); + if (tr->type == DMATEST_ET_VERIFY_BUF) { + for (i = 0; i < tr->vr->error_count; i++) { + seq_printf(sf, "\t%s\n", + verify_result_get_one(tr->vr, i)); + } + } + } + } + + mutex_unlock(&info->results_lock); + return 0; +} + +static int dtf_results_open(struct inode *inode, struct file *file) +{ + return single_open(file, dtf_results_show, inode->i_private); +} + +static const struct file_operations dtf_results_fops = { + .open = dtf_results_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int dmatest_register_dbgfs(struct dmatest_info *info) +{ + struct dentry *d; + struct dmatest_params *params = &info->dbgfs_params; + int ret = -ENOMEM; + + d = debugfs_create_dir("dmatest", NULL); + if (IS_ERR(d)) + return PTR_ERR(d); + if (!d) + goto err_root; + + info->root = d; + + /* Copy initial values */ + memcpy(params, &info->params, sizeof(*params)); + + /* Test parameters */ + + d = debugfs_create_u32("test_buf_size", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->buf_size); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_file("channel", S_IRUGO | S_IWUSR, info->root, + info, &dtf_channel_fops); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_file("device", S_IRUGO | S_IWUSR, info->root, + info, &dtf_device_fops); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("threads_per_chan", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->threads_per_chan); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("max_channels", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->max_channels); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("iterations", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->iterations); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("xor_sources", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->xor_sources); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("pq_sources", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->pq_sources); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + d = debugfs_create_u32("timeout", S_IWUSR | S_IRUGO, info->root, + (u32 *)¶ms->timeout); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + /* Run or stop threaded test */ + d = debugfs_create_file("run", S_IWUSR | S_IRUGO, info->root, + info, &dtf_run_fops); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + /* Results of test in progress */ + d = debugfs_create_file("results", S_IRUGO, info->root, info, + &dtf_results_fops); + if (IS_ERR_OR_NULL(d)) + goto err_node; + + return 0; + +err_node: + debugfs_remove_recursive(info->root); +err_root: + pr_err("dmatest: Failed to initialize debugfs\n"); + return ret; +} + +static int __init dmatest_init(void) +{ + struct dmatest_info *info = &test_info; + struct dmatest_params *params = &info->params; + int ret; + + memset(info, 0, sizeof(*info)); + + mutex_init(&info->lock); + INIT_LIST_HEAD(&info->channels); + + mutex_init(&info->results_lock); + INIT_LIST_HEAD(&info->results); + + /* Set default parameters */ + params->buf_size = test_buf_size; + strlcpy(params->channel, test_channel, sizeof(params->channel)); + strlcpy(params->device, test_device, sizeof(params->device)); + params->threads_per_chan = threads_per_chan; + params->max_channels = max_channels; + params->iterations = iterations; + params->xor_sources = xor_sources; + params->pq_sources = pq_sources; + params->timeout = timeout; + + ret = dmatest_register_dbgfs(info); + if (ret) + return ret; + +#ifdef MODULE + return 0; +#else + return run_threaded_test(info); +#endif +} +/* when compiled-in wait for drivers to load first */ +late_initcall(dmatest_init); + +static void __exit dmatest_exit(void) +{ + struct dmatest_info *info = &test_info; + + debugfs_remove_recursive(info->root); + stop_threaded_test(info); + result_free(info, NULL); } module_exit(dmatest_exit); diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c index 43a5329d4483..2e5deaa82b60 100644 --- a/drivers/dma/dw_dmac.c +++ b/drivers/dma/dw_dmac.c @@ -25,6 +25,8 @@ #include #include #include +#include +#include #include "dw_dmac_regs.h" #include "dmaengine.h" @@ -49,29 +51,22 @@ static inline unsigned int dwc_get_sms(struct dw_dma_slave *slave) return slave ? slave->src_master : 1; } -#define SRC_MASTER 0 -#define DST_MASTER 1 - -static inline unsigned int dwc_get_master(struct dma_chan *chan, int master) +static inline void dwc_set_masters(struct dw_dma_chan *dwc) { - struct dw_dma *dw = to_dw_dma(chan->device); - struct dw_dma_slave *dws = chan->private; - unsigned int m; - - if (master == SRC_MASTER) - m = dwc_get_sms(dws); - else - m = dwc_get_dms(dws); + struct dw_dma *dw = to_dw_dma(dwc->chan.device); + struct dw_dma_slave *dws = dwc->chan.private; + unsigned char mmax = dw->nr_masters - 1; - return min_t(unsigned int, dw->nr_masters - 1, m); + if (dwc->request_line == ~0) { + dwc->src_master = min_t(unsigned char, mmax, dwc_get_sms(dws)); + dwc->dst_master = min_t(unsigned char, mmax, dwc_get_dms(dws)); + } } #define DWC_DEFAULT_CTLLO(_chan) ({ \ struct dw_dma_chan *_dwc = to_dw_dma_chan(_chan); \ struct dma_slave_config *_sconfig = &_dwc->dma_sconfig; \ bool _is_slave = is_slave_direction(_dwc->direction); \ - int _dms = dwc_get_master(_chan, DST_MASTER); \ - int _sms = dwc_get_master(_chan, SRC_MASTER); \ u8 _smsize = _is_slave ? _sconfig->src_maxburst : \ DW_DMA_MSIZE_16; \ u8 _dmsize = _is_slave ? _sconfig->dst_maxburst : \ @@ -81,8 +76,8 @@ static inline unsigned int dwc_get_master(struct dma_chan *chan, int master) | DWC_CTLL_SRC_MSIZE(_smsize) \ | DWC_CTLL_LLP_D_EN \ | DWC_CTLL_LLP_S_EN \ - | DWC_CTLL_DMS(_dms) \ - | DWC_CTLL_SMS(_sms)); \ + | DWC_CTLL_DMS(_dwc->dst_master) \ + | DWC_CTLL_SMS(_dwc->src_master)); \ }) /* @@ -92,13 +87,6 @@ static inline unsigned int dwc_get_master(struct dma_chan *chan, int master) */ #define NR_DESCS_PER_CHANNEL 64 -static inline unsigned int dwc_get_data_width(struct dma_chan *chan, int master) -{ - struct dw_dma *dw = to_dw_dma(chan->device); - - return dw->data_width[dwc_get_master(chan, master)]; -} - /*----------------------------------------------------------------------*/ static struct device *chan2dev(struct dma_chan *chan) @@ -172,13 +160,7 @@ static void dwc_initialize(struct dw_dma_chan *dwc) if (dwc->initialized == true) return; - if (dws && dws->cfg_hi == ~0 && dws->cfg_lo == ~0) { - /* autoconfigure based on request line from DT */ - if (dwc->direction == DMA_MEM_TO_DEV) - cfghi = DWC_CFGH_DST_PER(dwc->request_line); - else if (dwc->direction == DMA_DEV_TO_MEM) - cfghi = DWC_CFGH_SRC_PER(dwc->request_line); - } else if (dws) { + if (dws) { /* * We need controller-specific data to set up slave * transfers. @@ -189,9 +171,9 @@ static void dwc_initialize(struct dw_dma_chan *dwc) cfglo |= dws->cfg_lo & ~DWC_CFGL_CH_PRIOR_MASK; } else { if (dwc->direction == DMA_MEM_TO_DEV) - cfghi = DWC_CFGH_DST_PER(dwc->dma_sconfig.slave_id); + cfghi = DWC_CFGH_DST_PER(dwc->request_line); else if (dwc->direction == DMA_DEV_TO_MEM) - cfghi = DWC_CFGH_SRC_PER(dwc->dma_sconfig.slave_id); + cfghi = DWC_CFGH_SRC_PER(dwc->request_line); } channel_writel(dwc, CFG_LO, cfglo); @@ -473,16 +455,16 @@ static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc) (unsigned long long)llp); list_for_each_entry_safe(desc, _desc, &dwc->active_list, desc_node) { - /* initial residue value */ + /* Initial residue value */ dwc->residue = desc->total_len; - /* check first descriptors addr */ + /* Check first descriptors addr */ if (desc->txd.phys == llp) { spin_unlock_irqrestore(&dwc->lock, flags); return; } - /* check first descriptors llp */ + /* Check first descriptors llp */ if (desc->lli.llp == llp) { /* This one is currently in progress */ dwc->residue -= dwc_get_sent(dwc); @@ -588,7 +570,7 @@ inline dma_addr_t dw_dma_get_dst_addr(struct dma_chan *chan) } EXPORT_SYMBOL(dw_dma_get_dst_addr); -/* called with dwc->lock held and all DMAC interrupts disabled */ +/* Called with dwc->lock held and all DMAC interrupts disabled */ static void dwc_handle_cyclic(struct dw_dma *dw, struct dw_dma_chan *dwc, u32 status_err, u32 status_xfer) { @@ -626,7 +608,7 @@ static void dwc_handle_cyclic(struct dw_dma *dw, struct dw_dma_chan *dwc, dwc_chan_disable(dw, dwc); - /* make sure DMA does not restart by loading a new list */ + /* Make sure DMA does not restart by loading a new list */ channel_writel(dwc, LLP, 0); channel_writel(dwc, CTL_LO, 0); channel_writel(dwc, CTL_HI, 0); @@ -745,6 +727,7 @@ dwc_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dest, dma_addr_t src, size_t len, unsigned long flags) { struct dw_dma_chan *dwc = to_dw_dma_chan(chan); + struct dw_dma *dw = to_dw_dma(chan->device); struct dw_desc *desc; struct dw_desc *first; struct dw_desc *prev; @@ -767,8 +750,8 @@ dwc_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dest, dma_addr_t src, dwc->direction = DMA_MEM_TO_MEM; - data_width = min_t(unsigned int, dwc_get_data_width(chan, SRC_MASTER), - dwc_get_data_width(chan, DST_MASTER)); + data_width = min_t(unsigned int, dw->data_width[dwc->src_master], + dw->data_width[dwc->dst_master]); src_width = dst_width = min_t(unsigned int, data_width, dwc_fast_fls(src | dest | len)); @@ -826,6 +809,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, unsigned long flags, void *context) { struct dw_dma_chan *dwc = to_dw_dma_chan(chan); + struct dw_dma *dw = to_dw_dma(chan->device); struct dma_slave_config *sconfig = &dwc->dma_sconfig; struct dw_desc *prev; struct dw_desc *first; @@ -859,7 +843,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, ctllo |= sconfig->device_fc ? DWC_CTLL_FC(DW_DMA_FC_P_M2P) : DWC_CTLL_FC(DW_DMA_FC_D_M2P); - data_width = dwc_get_data_width(chan, SRC_MASTER); + data_width = dw->data_width[dwc->src_master]; for_each_sg(sgl, sg, sg_len, i) { struct dw_desc *desc; @@ -919,7 +903,7 @@ slave_sg_todev_fill_desc: ctllo |= sconfig->device_fc ? DWC_CTLL_FC(DW_DMA_FC_P_P2M) : DWC_CTLL_FC(DW_DMA_FC_D_P2M); - data_width = dwc_get_data_width(chan, DST_MASTER); + data_width = dw->data_width[dwc->dst_master]; for_each_sg(sgl, sg, sg_len, i) { struct dw_desc *desc; @@ -1001,13 +985,6 @@ static inline void convert_burst(u32 *maxburst) *maxburst = 0; } -static inline void convert_slave_id(struct dw_dma_chan *dwc) -{ - struct dw_dma *dw = to_dw_dma(dwc->chan.device); - - dwc->dma_sconfig.slave_id -= dw->request_line_base; -} - static int set_runtime_config(struct dma_chan *chan, struct dma_slave_config *sconfig) { @@ -1020,9 +997,12 @@ set_runtime_config(struct dma_chan *chan, struct dma_slave_config *sconfig) memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig)); dwc->direction = sconfig->direction; + /* Take the request line from slave_id member */ + if (dwc->request_line == ~0) + dwc->request_line = sconfig->slave_id; + convert_burst(&dwc->dma_sconfig.src_maxburst); convert_burst(&dwc->dma_sconfig.dst_maxburst); - convert_slave_id(dwc); return 0; } @@ -1030,10 +1010,11 @@ set_runtime_config(struct dma_chan *chan, struct dma_slave_config *sconfig) static inline void dwc_chan_pause(struct dw_dma_chan *dwc) { u32 cfglo = channel_readl(dwc, CFG_LO); + unsigned int count = 20; /* timeout iterations */ channel_writel(dwc, CFG_LO, cfglo | DWC_CFGL_CH_SUSP); - while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY)) - cpu_relax(); + while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY) && count--) + udelay(2); dwc->paused = true; } @@ -1169,6 +1150,8 @@ static int dwc_alloc_chan_resources(struct dma_chan *chan) * doesn't mean what you think it means), and status writeback. */ + dwc_set_masters(dwc); + spin_lock_irqsave(&dwc->lock, flags); i = dwc->descs_allocated; while (dwc->descs_allocated < NR_DESCS_PER_CHANNEL) { @@ -1226,6 +1209,7 @@ static void dwc_free_chan_resources(struct dma_chan *chan) list_splice_init(&dwc->free_list, &list); dwc->descs_allocated = 0; dwc->initialized = false; + dwc->request_line = ~0; /* Disable interrupts */ channel_clear_bit(dw, MASK.XFER, dwc->mask); @@ -1241,42 +1225,36 @@ static void dwc_free_chan_resources(struct dma_chan *chan) dev_vdbg(chan2dev(chan), "%s: done\n", __func__); } -struct dw_dma_filter_args { +/*----------------------------------------------------------------------*/ + +struct dw_dma_of_filter_args { struct dw_dma *dw; unsigned int req; unsigned int src; unsigned int dst; }; -static bool dw_dma_generic_filter(struct dma_chan *chan, void *param) +static bool dw_dma_of_filter(struct dma_chan *chan, void *param) { struct dw_dma_chan *dwc = to_dw_dma_chan(chan); - struct dw_dma *dw = to_dw_dma(chan->device); - struct dw_dma_filter_args *fargs = param; - struct dw_dma_slave *dws = &dwc->slave; + struct dw_dma_of_filter_args *fargs = param; - /* ensure the device matches our channel */ + /* Ensure the device matches our channel */ if (chan->device != &fargs->dw->dma) return false; - dws->dma_dev = dw->dma.dev; - dws->cfg_hi = ~0; - dws->cfg_lo = ~0; - dws->src_master = fargs->src; - dws->dst_master = fargs->dst; - dwc->request_line = fargs->req; - - chan->private = dws; + dwc->src_master = fargs->src; + dwc->dst_master = fargs->dst; return true; } -static struct dma_chan *dw_dma_xlate(struct of_phandle_args *dma_spec, - struct of_dma *ofdma) +static struct dma_chan *dw_dma_of_xlate(struct of_phandle_args *dma_spec, + struct of_dma *ofdma) { struct dw_dma *dw = ofdma->of_dma_data; - struct dw_dma_filter_args fargs = { + struct dw_dma_of_filter_args fargs = { .dw = dw, }; dma_cap_mask_t cap; @@ -1297,8 +1275,48 @@ static struct dma_chan *dw_dma_xlate(struct of_phandle_args *dma_spec, dma_cap_set(DMA_SLAVE, cap); /* TODO: there should be a simpler way to do this */ - return dma_request_channel(cap, dw_dma_generic_filter, &fargs); + return dma_request_channel(cap, dw_dma_of_filter, &fargs); +} + +#ifdef CONFIG_ACPI +static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param) +{ + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); + struct acpi_dma_spec *dma_spec = param; + + if (chan->device->dev != dma_spec->dev || + chan->chan_id != dma_spec->chan_id) + return false; + + dwc->request_line = dma_spec->slave_id; + dwc->src_master = dwc_get_sms(NULL); + dwc->dst_master = dwc_get_dms(NULL); + + return true; +} + +static void dw_dma_acpi_controller_register(struct dw_dma *dw) +{ + struct device *dev = dw->dma.dev; + struct acpi_dma_filter_info *info; + int ret; + + info = devm_kzalloc(dev, sizeof(*info), GFP_KERNEL); + if (!info) + return; + + dma_cap_zero(info->dma_cap); + dma_cap_set(DMA_SLAVE, info->dma_cap); + info->filter_fn = dw_dma_acpi_filter; + + ret = devm_acpi_dma_controller_register(dev, acpi_dma_simple_xlate, + info); + if (ret) + dev_err(dev, "could not register acpi_dma_controller\n"); } +#else /* !CONFIG_ACPI */ +static inline void dw_dma_acpi_controller_register(struct dw_dma *dw) {} +#endif /* !CONFIG_ACPI */ /* --------------------- Cyclic DMA API extensions -------------------- */ @@ -1322,7 +1340,7 @@ int dw_dma_cyclic_start(struct dma_chan *chan) spin_lock_irqsave(&dwc->lock, flags); - /* assert channel is idle */ + /* Assert channel is idle */ if (dma_readl(dw, CH_EN) & dwc->mask) { dev_err(chan2dev(&dwc->chan), "BUG: Attempted to start non-idle channel\n"); @@ -1334,7 +1352,7 @@ int dw_dma_cyclic_start(struct dma_chan *chan) dma_writel(dw, CLEAR.ERROR, dwc->mask); dma_writel(dw, CLEAR.XFER, dwc->mask); - /* setup DMAC channel registers */ + /* Setup DMAC channel registers */ channel_writel(dwc, LLP, dwc->cdesc->desc[0]->txd.phys); channel_writel(dwc, CTL_LO, DWC_CTLL_LLP_D_EN | DWC_CTLL_LLP_S_EN); channel_writel(dwc, CTL_HI, 0); @@ -1501,7 +1519,7 @@ struct dw_cyclic_desc *dw_dma_cyclic_prep(struct dma_chan *chan, last = desc; } - /* lets make a cyclic list */ + /* Let's make a cyclic list */ last->lli.llp = cdesc->desc[0]->txd.phys; dev_dbg(chan2dev(&dwc->chan), "cyclic prepared buf 0x%llx len %zu " @@ -1636,7 +1654,6 @@ dw_dma_parse_dt(struct platform_device *pdev) static int dw_probe(struct platform_device *pdev) { - const struct platform_device_id *match; struct dw_dma_platform_data *pdata; struct resource *io; struct dw_dma *dw; @@ -1706,7 +1723,7 @@ static int dw_probe(struct platform_device *pdev) dw->regs = regs; - /* get hardware configuration parameters */ + /* Get hardware configuration parameters */ if (autocfg) { max_blk_size = dma_readl(dw, MAX_BLK_SIZE); @@ -1720,18 +1737,13 @@ static int dw_probe(struct platform_device *pdev) memcpy(dw->data_width, pdata->data_width, 4); } - /* Get the base request line if set */ - match = platform_get_device_id(pdev); - if (match) - dw->request_line_base = (unsigned int)match->driver_data; - /* Calculate all channel mask before DMA setup */ dw->all_chan_mask = (1 << nr_channels) - 1; - /* force dma off, just in case */ + /* Force dma off, just in case */ dw_dma_off(dw); - /* disable BLOCK interrupts as well */ + /* Disable BLOCK interrupts as well */ channel_clear_bit(dw, MASK.BLOCK, dw->all_chan_mask); err = devm_request_irq(&pdev->dev, irq, dw_dma_interrupt, 0, @@ -1741,7 +1753,7 @@ static int dw_probe(struct platform_device *pdev) platform_set_drvdata(pdev, dw); - /* create a pool of consistent memory blocks for hardware descriptors */ + /* Create a pool of consistent memory blocks for hardware descriptors */ dw->desc_pool = dmam_pool_create("dw_dmac_desc_pool", &pdev->dev, sizeof(struct dw_desc), 4, 0); if (!dw->desc_pool) { @@ -1781,8 +1793,9 @@ static int dw_probe(struct platform_device *pdev) channel_clear_bit(dw, CH_EN, dwc->mask); dwc->direction = DMA_TRANS_NONE; + dwc->request_line = ~0; - /* hardware configuration */ + /* Hardware configuration */ if (autocfg) { unsigned int dwc_params; @@ -1842,12 +1855,15 @@ static int dw_probe(struct platform_device *pdev) if (pdev->dev.of_node) { err = of_dma_controller_register(pdev->dev.of_node, - dw_dma_xlate, dw); - if (err && err != -ENODEV) + dw_dma_of_xlate, dw); + if (err) dev_err(&pdev->dev, "could not register of_dma_controller\n"); } + if (ACPI_HANDLE(&pdev->dev)) + dw_dma_acpi_controller_register(dw); + return 0; } @@ -1912,18 +1928,19 @@ static const struct dev_pm_ops dw_dev_pm_ops = { }; #ifdef CONFIG_OF -static const struct of_device_id dw_dma_id_table[] = { +static const struct of_device_id dw_dma_of_id_table[] = { { .compatible = "snps,dma-spear1340" }, {} }; -MODULE_DEVICE_TABLE(of, dw_dma_id_table); +MODULE_DEVICE_TABLE(of, dw_dma_of_id_table); #endif -static const struct platform_device_id dw_dma_ids[] = { - /* Name, Request Line Base */ - { "INTL9C60", (kernel_ulong_t)16 }, +#ifdef CONFIG_ACPI +static const struct acpi_device_id dw_dma_acpi_id_table[] = { + { "INTL9C60", 0 }, { } }; +#endif static struct platform_driver dw_driver = { .probe = dw_probe, @@ -1932,9 +1949,9 @@ static struct platform_driver dw_driver = { .driver = { .name = "dw_dmac", .pm = &dw_dev_pm_ops, - .of_match_table = of_match_ptr(dw_dma_id_table), + .of_match_table = of_match_ptr(dw_dma_of_id_table), + .acpi_match_table = ACPI_PTR(dw_dma_acpi_id_table), }, - .id_table = dw_dma_ids, }; static int __init dw_init(void) diff --git a/drivers/dma/dw_dmac_regs.h b/drivers/dma/dw_dmac_regs.h index 4d02c3669b75..9d417200bd57 100644 --- a/drivers/dma/dw_dmac_regs.h +++ b/drivers/dma/dw_dmac_regs.h @@ -212,8 +212,11 @@ struct dw_dma_chan { /* hardware configuration */ unsigned int block_size; bool nollp; + + /* custom slave configuration */ unsigned int request_line; - struct dw_dma_slave slave; + unsigned char src_master; + unsigned char dst_master; /* configuration passed via DMA_SLAVE_CONFIG */ struct dma_slave_config dma_sconfig; @@ -247,7 +250,6 @@ struct dw_dma { /* hardware configuration */ unsigned char nr_masters; unsigned char data_width[4]; - unsigned int request_line_base; struct dw_dma_chan chan[0]; }; diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c index 70b8975d107e..f28583370d00 100644 --- a/drivers/dma/imx-dma.c +++ b/drivers/dma/imx-dma.c @@ -859,8 +859,7 @@ static struct dma_async_tx_descriptor *imxdma_prep_dma_cyclic( desc = list_first_entry(&imxdmac->ld_free, struct imxdma_desc, node); - if (imxdmac->sg_list) - kfree(imxdmac->sg_list); + kfree(imxdmac->sg_list); imxdmac->sg_list = kcalloc(periods + 1, sizeof(struct scatterlist), GFP_KERNEL); @@ -1145,7 +1144,7 @@ err: return ret; } -static int __exit imxdma_remove(struct platform_device *pdev) +static int imxdma_remove(struct platform_device *pdev) { struct imxdma_engine *imxdma = platform_get_drvdata(pdev); @@ -1162,7 +1161,7 @@ static struct platform_driver imxdma_driver = { .name = "imx-dma", }, .id_table = imx_dma_devtype, - .remove = __exit_p(imxdma_remove), + .remove = imxdma_remove, }; static int __init imxdma_module_init(void) diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index f082aa3a918c..092867bf795c 100644 --- a/drivers/dma/imx-sdma.c +++ b/drivers/dma/imx-sdma.c @@ -1462,7 +1462,7 @@ err_irq: return ret; } -static int __exit sdma_remove(struct platform_device *pdev) +static int sdma_remove(struct platform_device *pdev) { return -EBUSY; } @@ -1473,7 +1473,7 @@ static struct platform_driver sdma_driver = { .of_match_table = sdma_dt_ids, }, .id_table = sdma_devtypes, - .remove = __exit_p(sdma_remove), + .remove = sdma_remove, }; static int __init sdma_module_init(void) diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c index 1879a5942bfc..17a2393b3e25 100644 --- a/drivers/dma/ioat/dma.c +++ b/drivers/dma/ioat/dma.c @@ -892,7 +892,7 @@ MODULE_PARM_DESC(ioat_interrupt_style, * ioat_dma_setup_interrupts - setup interrupt handler * @device: ioat device */ -static int ioat_dma_setup_interrupts(struct ioatdma_device *device) +int ioat_dma_setup_interrupts(struct ioatdma_device *device) { struct ioat_chan_common *chan; struct pci_dev *pdev = device->pdev; @@ -941,6 +941,7 @@ msix: } } intrctrl |= IOAT_INTRCTRL_MSIX_VECTOR_CONTROL; + device->irq_mode = IOAT_MSIX; goto done; msix_single_vector: @@ -956,6 +957,7 @@ msix_single_vector: pci_disable_msix(pdev); goto msi; } + device->irq_mode = IOAT_MSIX_SINGLE; goto done; msi: @@ -969,6 +971,7 @@ msi: pci_disable_msi(pdev); goto intx; } + device->irq_mode = IOAT_MSIX; goto done; intx: @@ -977,6 +980,7 @@ intx: if (err) goto err_no_irq; + device->irq_mode = IOAT_INTX; done: if (device->intr_quirk) device->intr_quirk(device); @@ -987,9 +991,11 @@ done: err_no_irq: /* Disable all interrupt generation */ writeb(0, device->reg_base + IOAT_INTRCTRL_OFFSET); + device->irq_mode = IOAT_NOIRQ; dev_err(dev, "no usable interrupts\n"); return err; } +EXPORT_SYMBOL(ioat_dma_setup_interrupts); static void ioat_disable_interrupts(struct ioatdma_device *device) { diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h index 53a4cbb78f47..54fb7b9ff9aa 100644 --- a/drivers/dma/ioat/dma.h +++ b/drivers/dma/ioat/dma.h @@ -39,6 +39,7 @@ #define to_ioat_desc(lh) container_of(lh, struct ioat_desc_sw, node) #define tx_to_ioat_desc(tx) container_of(tx, struct ioat_desc_sw, txd) #define to_dev(ioat_chan) (&(ioat_chan)->device->pdev->dev) +#define to_pdev(ioat_chan) ((ioat_chan)->device->pdev) #define chan_num(ch) ((int)((ch)->reg_base - (ch)->device->reg_base) / 0x80) @@ -48,6 +49,14 @@ */ #define NULL_DESC_BUFFER_SIZE 1 +enum ioat_irq_mode { + IOAT_NOIRQ = 0, + IOAT_MSIX, + IOAT_MSIX_SINGLE, + IOAT_MSI, + IOAT_INTX +}; + /** * struct ioatdma_device - internal representation of a IOAT device * @pdev: PCI-Express device @@ -72,11 +81,16 @@ struct ioatdma_device { void __iomem *reg_base; struct pci_pool *dma_pool; struct pci_pool *completion_pool; +#define MAX_SED_POOLS 5 + struct dma_pool *sed_hw_pool[MAX_SED_POOLS]; + struct kmem_cache *sed_pool; struct dma_device common; u8 version; struct msix_entry msix_entries[4]; struct ioat_chan_common *idx[4]; struct dca_provider *dca; + enum ioat_irq_mode irq_mode; + u32 cap; void (*intr_quirk)(struct ioatdma_device *device); int (*enumerate_channels)(struct ioatdma_device *device); int (*reset_hw)(struct ioat_chan_common *chan); @@ -131,6 +145,20 @@ struct ioat_dma_chan { u16 active; }; +/** + * struct ioat_sed_ent - wrapper around super extended hardware descriptor + * @hw: hardware SED + * @sed_dma: dma address for the SED + * @list: list member + * @parent: point to the dma descriptor that's the parent + */ +struct ioat_sed_ent { + struct ioat_sed_raw_descriptor *hw; + dma_addr_t dma; + struct ioat_ring_ent *parent; + unsigned int hw_pool; +}; + static inline struct ioat_chan_common *to_chan_common(struct dma_chan *c) { return container_of(c, struct ioat_chan_common, common); @@ -179,7 +207,7 @@ __dump_desc_dbg(struct ioat_chan_common *chan, struct ioat_dma_descriptor *hw, struct device *dev = to_dev(chan); dev_dbg(dev, "desc[%d]: (%#llx->%#llx) cookie: %d flags: %#x" - " ctl: %#x (op: %d int_en: %d compl: %d)\n", id, + " ctl: %#10.8x (op: %#x int_en: %d compl: %d)\n", id, (unsigned long long) tx->phys, (unsigned long long) hw->next, tx->cookie, tx->flags, hw->ctl, hw->ctl_f.op, hw->ctl_f.int_en, hw->ctl_f.compl_write); @@ -201,7 +229,7 @@ ioat_chan_by_index(struct ioatdma_device *device, int index) return device->idx[index]; } -static inline u64 ioat_chansts(struct ioat_chan_common *chan) +static inline u64 ioat_chansts_32(struct ioat_chan_common *chan) { u8 ver = chan->device->version; u64 status; @@ -218,6 +246,26 @@ static inline u64 ioat_chansts(struct ioat_chan_common *chan) return status; } +#if BITS_PER_LONG == 64 + +static inline u64 ioat_chansts(struct ioat_chan_common *chan) +{ + u8 ver = chan->device->version; + u64 status; + + /* With IOAT v3.3 the status register is 64bit. */ + if (ver >= IOAT_VER_3_3) + status = readq(chan->reg_base + IOAT_CHANSTS_OFFSET(ver)); + else + status = ioat_chansts_32(chan); + + return status; +} + +#else +#define ioat_chansts ioat_chansts_32 +#endif + static inline void ioat_start(struct ioat_chan_common *chan) { u8 ver = chan->device->version; @@ -321,6 +369,7 @@ bool ioat_cleanup_preamble(struct ioat_chan_common *chan, dma_addr_t *phys_complete); void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type); void ioat_kobject_del(struct ioatdma_device *device); +int ioat_dma_setup_interrupts(struct ioatdma_device *device); extern const struct sysfs_ops ioat_sysfs_ops; extern struct ioat_sysfs_entry ioat_version_attr; extern struct ioat_sysfs_entry ioat_cap_attr; diff --git a/drivers/dma/ioat/dma_v2.h b/drivers/dma/ioat/dma_v2.h index e100f644e344..29bf9448035d 100644 --- a/drivers/dma/ioat/dma_v2.h +++ b/drivers/dma/ioat/dma_v2.h @@ -137,6 +137,7 @@ struct ioat_ring_ent { #ifdef DEBUG int id; #endif + struct ioat_sed_ent *sed; }; static inline struct ioat_ring_ent * @@ -157,6 +158,7 @@ static inline void ioat2_set_chainaddr(struct ioat2_dma_chan *ioat, u64 addr) int ioat2_dma_probe(struct ioatdma_device *dev, int dca); int ioat3_dma_probe(struct ioatdma_device *dev, int dca); +void ioat3_dma_remove(struct ioatdma_device *dev); struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase); struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase); int ioat2_check_space_lock(struct ioat2_dma_chan *ioat, int num_descs); diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c index e8336cce360b..ca6ea9b3551b 100644 --- a/drivers/dma/ioat/dma_v3.c +++ b/drivers/dma/ioat/dma_v3.c @@ -55,7 +55,7 @@ /* * Support routines for v3+ hardware */ - +#include #include #include #include @@ -70,6 +70,10 @@ /* ioat hardware assumes at least two sources for raid operations */ #define src_cnt_to_sw(x) ((x) + 2) #define src_cnt_to_hw(x) ((x) - 2) +#define ndest_to_sw(x) ((x) + 1) +#define ndest_to_hw(x) ((x) - 1) +#define src16_cnt_to_sw(x) ((x) + 9) +#define src16_cnt_to_hw(x) ((x) - 9) /* provide a lookup table for setting the source address in the base or * extended descriptor of an xor or pq descriptor @@ -77,7 +81,20 @@ static const u8 xor_idx_to_desc = 0xe0; static const u8 xor_idx_to_field[] = { 1, 4, 5, 6, 7, 0, 1, 2 }; static const u8 pq_idx_to_desc = 0xf8; +static const u8 pq16_idx_to_desc[] = { 0, 0, 1, 1, 1, 1, 1, 1, 1, + 2, 2, 2, 2, 2, 2, 2 }; static const u8 pq_idx_to_field[] = { 1, 4, 5, 0, 1, 2, 4, 5 }; +static const u8 pq16_idx_to_field[] = { 1, 4, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6 }; + +/* + * technically sources 1 and 2 do not require SED, but the op will have + * at least 9 descriptors so that's irrelevant. + */ +static const u8 pq16_idx_to_sed[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, + 1, 1, 1, 1, 1, 1, 1 }; + +static void ioat3_eh(struct ioat2_dma_chan *ioat); static dma_addr_t xor_get_src(struct ioat_raw_descriptor *descs[2], int idx) { @@ -101,6 +118,13 @@ static dma_addr_t pq_get_src(struct ioat_raw_descriptor *descs[2], int idx) return raw->field[pq_idx_to_field[idx]]; } +static dma_addr_t pq16_get_src(struct ioat_raw_descriptor *desc[3], int idx) +{ + struct ioat_raw_descriptor *raw = desc[pq16_idx_to_desc[idx]]; + + return raw->field[pq16_idx_to_field[idx]]; +} + static void pq_set_src(struct ioat_raw_descriptor *descs[2], dma_addr_t addr, u32 offset, u8 coef, int idx) { @@ -111,6 +135,167 @@ static void pq_set_src(struct ioat_raw_descriptor *descs[2], pq->coef[idx] = coef; } +static int sed_get_pq16_pool_idx(int src_cnt) +{ + + return pq16_idx_to_sed[src_cnt]; +} + +static bool is_jf_ioat(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_JSF0: + case PCI_DEVICE_ID_INTEL_IOAT_JSF1: + case PCI_DEVICE_ID_INTEL_IOAT_JSF2: + case PCI_DEVICE_ID_INTEL_IOAT_JSF3: + case PCI_DEVICE_ID_INTEL_IOAT_JSF4: + case PCI_DEVICE_ID_INTEL_IOAT_JSF5: + case PCI_DEVICE_ID_INTEL_IOAT_JSF6: + case PCI_DEVICE_ID_INTEL_IOAT_JSF7: + case PCI_DEVICE_ID_INTEL_IOAT_JSF8: + case PCI_DEVICE_ID_INTEL_IOAT_JSF9: + return true; + default: + return false; + } +} + +static bool is_snb_ioat(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_SNB0: + case PCI_DEVICE_ID_INTEL_IOAT_SNB1: + case PCI_DEVICE_ID_INTEL_IOAT_SNB2: + case PCI_DEVICE_ID_INTEL_IOAT_SNB3: + case PCI_DEVICE_ID_INTEL_IOAT_SNB4: + case PCI_DEVICE_ID_INTEL_IOAT_SNB5: + case PCI_DEVICE_ID_INTEL_IOAT_SNB6: + case PCI_DEVICE_ID_INTEL_IOAT_SNB7: + case PCI_DEVICE_ID_INTEL_IOAT_SNB8: + case PCI_DEVICE_ID_INTEL_IOAT_SNB9: + return true; + default: + return false; + } +} + +static bool is_ivb_ioat(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_IVB0: + case PCI_DEVICE_ID_INTEL_IOAT_IVB1: + case PCI_DEVICE_ID_INTEL_IOAT_IVB2: + case PCI_DEVICE_ID_INTEL_IOAT_IVB3: + case PCI_DEVICE_ID_INTEL_IOAT_IVB4: + case PCI_DEVICE_ID_INTEL_IOAT_IVB5: + case PCI_DEVICE_ID_INTEL_IOAT_IVB6: + case PCI_DEVICE_ID_INTEL_IOAT_IVB7: + case PCI_DEVICE_ID_INTEL_IOAT_IVB8: + case PCI_DEVICE_ID_INTEL_IOAT_IVB9: + return true; + default: + return false; + } + +} + +static bool is_hsw_ioat(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_HSW0: + case PCI_DEVICE_ID_INTEL_IOAT_HSW1: + case PCI_DEVICE_ID_INTEL_IOAT_HSW2: + case PCI_DEVICE_ID_INTEL_IOAT_HSW3: + case PCI_DEVICE_ID_INTEL_IOAT_HSW4: + case PCI_DEVICE_ID_INTEL_IOAT_HSW5: + case PCI_DEVICE_ID_INTEL_IOAT_HSW6: + case PCI_DEVICE_ID_INTEL_IOAT_HSW7: + case PCI_DEVICE_ID_INTEL_IOAT_HSW8: + case PCI_DEVICE_ID_INTEL_IOAT_HSW9: + return true; + default: + return false; + } + +} + +static bool is_xeon_cb32(struct pci_dev *pdev) +{ + return is_jf_ioat(pdev) || is_snb_ioat(pdev) || is_ivb_ioat(pdev) || + is_hsw_ioat(pdev); +} + +static bool is_bwd_ioat(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_BWD0: + case PCI_DEVICE_ID_INTEL_IOAT_BWD1: + case PCI_DEVICE_ID_INTEL_IOAT_BWD2: + case PCI_DEVICE_ID_INTEL_IOAT_BWD3: + return true; + default: + return false; + } +} + +static bool is_bwd_noraid(struct pci_dev *pdev) +{ + switch (pdev->device) { + case PCI_DEVICE_ID_INTEL_IOAT_BWD2: + case PCI_DEVICE_ID_INTEL_IOAT_BWD3: + return true; + default: + return false; + } + +} + +static void pq16_set_src(struct ioat_raw_descriptor *desc[3], + dma_addr_t addr, u32 offset, u8 coef, int idx) +{ + struct ioat_pq_descriptor *pq = (struct ioat_pq_descriptor *)desc[0]; + struct ioat_pq16a_descriptor *pq16 = + (struct ioat_pq16a_descriptor *)desc[1]; + struct ioat_raw_descriptor *raw = desc[pq16_idx_to_desc[idx]]; + + raw->field[pq16_idx_to_field[idx]] = addr + offset; + + if (idx < 8) + pq->coef[idx] = coef; + else + pq16->coef[idx - 8] = coef; +} + +static struct ioat_sed_ent * +ioat3_alloc_sed(struct ioatdma_device *device, unsigned int hw_pool) +{ + struct ioat_sed_ent *sed; + gfp_t flags = __GFP_ZERO | GFP_ATOMIC; + + sed = kmem_cache_alloc(device->sed_pool, flags); + if (!sed) + return NULL; + + sed->hw_pool = hw_pool; + sed->hw = dma_pool_alloc(device->sed_hw_pool[hw_pool], + flags, &sed->dma); + if (!sed->hw) { + kmem_cache_free(device->sed_pool, sed); + return NULL; + } + + return sed; +} + +static void ioat3_free_sed(struct ioatdma_device *device, struct ioat_sed_ent *sed) +{ + if (!sed) + return; + + dma_pool_free(device->sed_hw_pool[sed->hw_pool], sed->hw, sed->dma); + kmem_cache_free(device->sed_pool, sed); +} + static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc, int idx) { @@ -223,6 +408,54 @@ static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat, } break; } + case IOAT_OP_PQ_16S: + case IOAT_OP_PQ_VAL_16S: { + struct ioat_pq_descriptor *pq = desc->pq; + int src_cnt = src16_cnt_to_sw(pq->ctl_f.src_cnt); + struct ioat_raw_descriptor *descs[4]; + int i; + + /* in the 'continue' case don't unmap the dests as sources */ + if (dmaf_p_disabled_continue(flags)) + src_cnt--; + else if (dmaf_continue(flags)) + src_cnt -= 3; + + if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) { + descs[0] = (struct ioat_raw_descriptor *)pq; + descs[1] = (struct ioat_raw_descriptor *)(desc->sed->hw); + descs[2] = (struct ioat_raw_descriptor *)(&desc->sed->hw->b[0]); + for (i = 0; i < src_cnt; i++) { + dma_addr_t src = pq16_get_src(descs, i); + + ioat_unmap(pdev, src - offset, len, + PCI_DMA_TODEVICE, flags, 0); + } + + /* the dests are sources in pq validate operations */ + if (pq->ctl_f.op == IOAT_OP_XOR_VAL) { + if (!(flags & DMA_PREP_PQ_DISABLE_P)) + ioat_unmap(pdev, pq->p_addr - offset, + len, PCI_DMA_TODEVICE, + flags, 0); + if (!(flags & DMA_PREP_PQ_DISABLE_Q)) + ioat_unmap(pdev, pq->q_addr - offset, + len, PCI_DMA_TODEVICE, + flags, 0); + break; + } + } + + if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) { + if (!(flags & DMA_PREP_PQ_DISABLE_P)) + ioat_unmap(pdev, pq->p_addr - offset, len, + PCI_DMA_BIDIRECTIONAL, flags, 1); + if (!(flags & DMA_PREP_PQ_DISABLE_Q)) + ioat_unmap(pdev, pq->q_addr - offset, len, + PCI_DMA_BIDIRECTIONAL, flags, 1); + } + break; + } default: dev_err(&pdev->dev, "%s: unknown op type: %#x\n", __func__, desc->hw->ctl_f.op); @@ -250,6 +483,63 @@ static bool desc_has_ext(struct ioat_ring_ent *desc) return false; } +static u64 ioat3_get_current_completion(struct ioat_chan_common *chan) +{ + u64 phys_complete; + u64 completion; + + completion = *chan->completion; + phys_complete = ioat_chansts_to_addr(completion); + + dev_dbg(to_dev(chan), "%s: phys_complete: %#llx\n", __func__, + (unsigned long long) phys_complete); + + return phys_complete; +} + +static bool ioat3_cleanup_preamble(struct ioat_chan_common *chan, + u64 *phys_complete) +{ + *phys_complete = ioat3_get_current_completion(chan); + if (*phys_complete == chan->last_completion) + return false; + + clear_bit(IOAT_COMPLETION_ACK, &chan->state); + mod_timer(&chan->timer, jiffies + COMPLETION_TIMEOUT); + + return true; +} + +static void +desc_get_errstat(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc) +{ + struct ioat_dma_descriptor *hw = desc->hw; + + switch (hw->ctl_f.op) { + case IOAT_OP_PQ_VAL: + case IOAT_OP_PQ_VAL_16S: + { + struct ioat_pq_descriptor *pq = desc->pq; + + /* check if there's error written */ + if (!pq->dwbes_f.wbes) + return; + + /* need to set a chanerr var for checking to clear later */ + + if (pq->dwbes_f.p_val_err) + *desc->result |= SUM_CHECK_P_RESULT; + + if (pq->dwbes_f.q_val_err) + *desc->result |= SUM_CHECK_Q_RESULT; + + return; + } + default: + return; + } +} + /** * __cleanup - reclaim used descriptors * @ioat: channel (ring) to clean @@ -260,6 +550,7 @@ static bool desc_has_ext(struct ioat_ring_ent *desc) static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) { struct ioat_chan_common *chan = &ioat->base; + struct ioatdma_device *device = chan->device; struct ioat_ring_ent *desc; bool seen_current = false; int idx = ioat->tail, i; @@ -268,6 +559,16 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) dev_dbg(to_dev(chan), "%s: head: %#x tail: %#x issued: %#x\n", __func__, ioat->head, ioat->tail, ioat->issued); + /* + * At restart of the channel, the completion address and the + * channel status will be 0 due to starting a new chain. Since + * it's new chain and the first descriptor "fails", there is + * nothing to clean up. We do not want to reap the entire submitted + * chain due to this 0 address value and then BUG. + */ + if (!phys_complete) + return; + active = ioat2_ring_active(ioat); for (i = 0; i < active && !seen_current; i++) { struct dma_async_tx_descriptor *tx; @@ -276,6 +577,11 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) prefetch(ioat2_get_ring_ent(ioat, idx + i + 1)); desc = ioat2_get_ring_ent(ioat, idx + i); dump_desc_dbg(ioat, desc); + + /* set err stat if we are using dwbes */ + if (device->cap & IOAT_CAP_DWBES) + desc_get_errstat(ioat, desc); + tx = &desc->txd; if (tx->cookie) { dma_cookie_complete(tx); @@ -294,6 +600,12 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) BUG_ON(i + 1 >= active); i++; } + + /* cleanup super extended descriptors */ + if (desc->sed) { + ioat3_free_sed(device, desc->sed); + desc->sed = NULL; + } } smp_mb(); /* finish all descriptor reads before incrementing tail */ ioat->tail = idx + i; @@ -314,11 +626,22 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) static void ioat3_cleanup(struct ioat2_dma_chan *ioat) { struct ioat_chan_common *chan = &ioat->base; - dma_addr_t phys_complete; + u64 phys_complete; spin_lock_bh(&chan->cleanup_lock); - if (ioat_cleanup_preamble(chan, &phys_complete)) + + if (ioat3_cleanup_preamble(chan, &phys_complete)) __cleanup(ioat, phys_complete); + + if (is_ioat_halted(*chan->completion)) { + u32 chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET); + + if (chanerr & IOAT_CHANERR_HANDLE_MASK) { + mod_timer(&chan->timer, jiffies + IDLE_TIMEOUT); + ioat3_eh(ioat); + } + } + spin_unlock_bh(&chan->cleanup_lock); } @@ -333,15 +656,78 @@ static void ioat3_cleanup_event(unsigned long data) static void ioat3_restart_channel(struct ioat2_dma_chan *ioat) { struct ioat_chan_common *chan = &ioat->base; - dma_addr_t phys_complete; + u64 phys_complete; ioat2_quiesce(chan, 0); - if (ioat_cleanup_preamble(chan, &phys_complete)) + if (ioat3_cleanup_preamble(chan, &phys_complete)) __cleanup(ioat, phys_complete); __ioat2_restart_chan(ioat); } +static void ioat3_eh(struct ioat2_dma_chan *ioat) +{ + struct ioat_chan_common *chan = &ioat->base; + struct pci_dev *pdev = to_pdev(chan); + struct ioat_dma_descriptor *hw; + u64 phys_complete; + struct ioat_ring_ent *desc; + u32 err_handled = 0; + u32 chanerr_int; + u32 chanerr; + + /* cleanup so tail points to descriptor that caused the error */ + if (ioat3_cleanup_preamble(chan, &phys_complete)) + __cleanup(ioat, phys_complete); + + chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET); + pci_read_config_dword(pdev, IOAT_PCI_CHANERR_INT_OFFSET, &chanerr_int); + + dev_dbg(to_dev(chan), "%s: error = %x:%x\n", + __func__, chanerr, chanerr_int); + + desc = ioat2_get_ring_ent(ioat, ioat->tail); + hw = desc->hw; + dump_desc_dbg(ioat, desc); + + switch (hw->ctl_f.op) { + case IOAT_OP_XOR_VAL: + if (chanerr & IOAT_CHANERR_XOR_P_OR_CRC_ERR) { + *desc->result |= SUM_CHECK_P_RESULT; + err_handled |= IOAT_CHANERR_XOR_P_OR_CRC_ERR; + } + break; + case IOAT_OP_PQ_VAL: + case IOAT_OP_PQ_VAL_16S: + if (chanerr & IOAT_CHANERR_XOR_P_OR_CRC_ERR) { + *desc->result |= SUM_CHECK_P_RESULT; + err_handled |= IOAT_CHANERR_XOR_P_OR_CRC_ERR; + } + if (chanerr & IOAT_CHANERR_XOR_Q_ERR) { + *desc->result |= SUM_CHECK_Q_RESULT; + err_handled |= IOAT_CHANERR_XOR_Q_ERR; + } + break; + } + + /* fault on unhandled error or spurious halt */ + if (chanerr ^ err_handled || chanerr == 0) { + dev_err(to_dev(chan), "%s: fatal error (%x:%x)\n", + __func__, chanerr, err_handled); + BUG(); + } + + writel(chanerr, chan->reg_base + IOAT_CHANERR_OFFSET); + pci_write_config_dword(pdev, IOAT_PCI_CHANERR_INT_OFFSET, chanerr_int); + + /* mark faulting descriptor as complete */ + *chan->completion = desc->txd.phys; + + spin_lock_bh(&ioat->prep_lock); + ioat3_restart_channel(ioat); + spin_unlock_bh(&ioat->prep_lock); +} + static void check_active(struct ioat2_dma_chan *ioat) { struct ioat_chan_common *chan = &ioat->base; @@ -605,7 +991,8 @@ dump_pq_desc_dbg(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc, struct int i; dev_dbg(dev, "desc[%d]: (%#llx->%#llx) flags: %#x" - " sz: %#x ctl: %#x (op: %d int: %d compl: %d pq: '%s%s' src_cnt: %d)\n", + " sz: %#10.8x ctl: %#x (op: %#x int: %d compl: %d pq: '%s%s'" + " src_cnt: %d)\n", desc_id(desc), (unsigned long long) desc->txd.phys, (unsigned long long) (pq_ex ? pq_ex->next : pq->next), desc->txd.flags, pq->size, pq->ctl, pq->ctl_f.op, pq->ctl_f.int_en, @@ -617,6 +1004,42 @@ dump_pq_desc_dbg(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc, struct (unsigned long long) pq_get_src(descs, i), pq->coef[i]); dev_dbg(dev, "\tP: %#llx\n", pq->p_addr); dev_dbg(dev, "\tQ: %#llx\n", pq->q_addr); + dev_dbg(dev, "\tNEXT: %#llx\n", pq->next); +} + +static void dump_pq16_desc_dbg(struct ioat2_dma_chan *ioat, + struct ioat_ring_ent *desc) +{ + struct device *dev = to_dev(&ioat->base); + struct ioat_pq_descriptor *pq = desc->pq; + struct ioat_raw_descriptor *descs[] = { (void *)pq, + (void *)pq, + (void *)pq }; + int src_cnt = src16_cnt_to_sw(pq->ctl_f.src_cnt); + int i; + + if (desc->sed) { + descs[1] = (void *)desc->sed->hw; + descs[2] = (void *)desc->sed->hw + 64; + } + + dev_dbg(dev, "desc[%d]: (%#llx->%#llx) flags: %#x" + " sz: %#x ctl: %#x (op: %#x int: %d compl: %d pq: '%s%s'" + " src_cnt: %d)\n", + desc_id(desc), (unsigned long long) desc->txd.phys, + (unsigned long long) pq->next, + desc->txd.flags, pq->size, pq->ctl, + pq->ctl_f.op, pq->ctl_f.int_en, + pq->ctl_f.compl_write, + pq->ctl_f.p_disable ? "" : "p", pq->ctl_f.q_disable ? "" : "q", + pq->ctl_f.src_cnt); + for (i = 0; i < src_cnt; i++) { + dev_dbg(dev, "\tsrc[%d]: %#llx coef: %#x\n", i, + (unsigned long long) pq16_get_src(descs, i), + pq->coef[i]); + } + dev_dbg(dev, "\tP: %#llx\n", pq->p_addr); + dev_dbg(dev, "\tQ: %#llx\n", pq->q_addr); } static struct dma_async_tx_descriptor * @@ -627,6 +1050,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result, { struct ioat2_dma_chan *ioat = to_ioat2_chan(c); struct ioat_chan_common *chan = &ioat->base; + struct ioatdma_device *device = chan->device; struct ioat_ring_ent *compl_desc; struct ioat_ring_ent *desc; struct ioat_ring_ent *ext; @@ -637,6 +1061,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result, u32 offset = 0; u8 op = result ? IOAT_OP_PQ_VAL : IOAT_OP_PQ; int i, s, idx, with_ext, num_descs; + int cb32 = (device->version < IOAT_VER_3_3) ? 1 : 0; dev_dbg(to_dev(chan), "%s\n", __func__); /* the engine requires at least two sources (we provide @@ -662,7 +1087,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result, * order. */ if (likely(num_descs) && - ioat2_check_space_lock(ioat, num_descs+1) == 0) + ioat2_check_space_lock(ioat, num_descs + cb32) == 0) idx = ioat->head; else return NULL; @@ -700,6 +1125,9 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result, pq->q_addr = dst[1] + offset; pq->ctl = 0; pq->ctl_f.op = op; + /* we turn on descriptor write back error status */ + if (device->cap & IOAT_CAP_DWBES) + pq->ctl_f.wb_en = result ? 1 : 0; pq->ctl_f.src_cnt = src_cnt_to_hw(s); pq->ctl_f.p_disable = !!(flags & DMA_PREP_PQ_DISABLE_P); pq->ctl_f.q_disable = !!(flags & DMA_PREP_PQ_DISABLE_Q); @@ -716,26 +1144,140 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result, pq->ctl_f.fence = !!(flags & DMA_PREP_FENCE); dump_pq_desc_dbg(ioat, desc, ext); - /* completion descriptor carries interrupt bit */ - compl_desc = ioat2_get_ring_ent(ioat, idx + i); - compl_desc->txd.flags = flags & DMA_PREP_INTERRUPT; - hw = compl_desc->hw; - hw->ctl = 0; - hw->ctl_f.null = 1; - hw->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT); - hw->ctl_f.compl_write = 1; - hw->size = NULL_DESC_BUFFER_SIZE; - dump_desc_dbg(ioat, compl_desc); + if (!cb32) { + pq->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT); + pq->ctl_f.compl_write = 1; + compl_desc = desc; + } else { + /* completion descriptor carries interrupt bit */ + compl_desc = ioat2_get_ring_ent(ioat, idx + i); + compl_desc->txd.flags = flags & DMA_PREP_INTERRUPT; + hw = compl_desc->hw; + hw->ctl = 0; + hw->ctl_f.null = 1; + hw->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT); + hw->ctl_f.compl_write = 1; + hw->size = NULL_DESC_BUFFER_SIZE; + dump_desc_dbg(ioat, compl_desc); + } + /* we leave the channel locked to ensure in order submission */ return &compl_desc->txd; } +static struct dma_async_tx_descriptor * +__ioat3_prep_pq16_lock(struct dma_chan *c, enum sum_check_flags *result, + const dma_addr_t *dst, const dma_addr_t *src, + unsigned int src_cnt, const unsigned char *scf, + size_t len, unsigned long flags) +{ + struct ioat2_dma_chan *ioat = to_ioat2_chan(c); + struct ioat_chan_common *chan = &ioat->base; + struct ioatdma_device *device = chan->device; + struct ioat_ring_ent *desc; + size_t total_len = len; + struct ioat_pq_descriptor *pq; + u32 offset = 0; + u8 op; + int i, s, idx, num_descs; + + /* this function only handles src_cnt 9 - 16 */ + BUG_ON(src_cnt < 9); + + /* this function is only called with 9-16 sources */ + op = result ? IOAT_OP_PQ_VAL_16S : IOAT_OP_PQ_16S; + + dev_dbg(to_dev(chan), "%s\n", __func__); + + num_descs = ioat2_xferlen_to_descs(ioat, len); + + /* + * 16 source pq is only available on cb3.3 and has no completion + * write hw bug. + */ + if (num_descs && ioat2_check_space_lock(ioat, num_descs) == 0) + idx = ioat->head; + else + return NULL; + + i = 0; + + do { + struct ioat_raw_descriptor *descs[4]; + size_t xfer_size = min_t(size_t, len, 1 << ioat->xfercap_log); + + desc = ioat2_get_ring_ent(ioat, idx + i); + pq = desc->pq; + + descs[0] = (struct ioat_raw_descriptor *) pq; + + desc->sed = ioat3_alloc_sed(device, + sed_get_pq16_pool_idx(src_cnt)); + if (!desc->sed) { + dev_err(to_dev(chan), + "%s: no free sed entries\n", __func__); + return NULL; + } + + pq->sed_addr = desc->sed->dma; + desc->sed->parent = desc; + + descs[1] = (struct ioat_raw_descriptor *)desc->sed->hw; + descs[2] = (void *)descs[1] + 64; + + for (s = 0; s < src_cnt; s++) + pq16_set_src(descs, src[s], offset, scf[s], s); + + /* see the comment for dma_maxpq in include/linux/dmaengine.h */ + if (dmaf_p_disabled_continue(flags)) + pq16_set_src(descs, dst[1], offset, 1, s++); + else if (dmaf_continue(flags)) { + pq16_set_src(descs, dst[0], offset, 0, s++); + pq16_set_src(descs, dst[1], offset, 1, s++); + pq16_set_src(descs, dst[1], offset, 0, s++); + } + + pq->size = xfer_size; + pq->p_addr = dst[0] + offset; + pq->q_addr = dst[1] + offset; + pq->ctl = 0; + pq->ctl_f.op = op; + pq->ctl_f.src_cnt = src16_cnt_to_hw(s); + /* we turn on descriptor write back error status */ + if (device->cap & IOAT_CAP_DWBES) + pq->ctl_f.wb_en = result ? 1 : 0; + pq->ctl_f.p_disable = !!(flags & DMA_PREP_PQ_DISABLE_P); + pq->ctl_f.q_disable = !!(flags & DMA_PREP_PQ_DISABLE_Q); + + len -= xfer_size; + offset += xfer_size; + } while (++i < num_descs); + + /* last pq descriptor carries the unmap parameters and fence bit */ + desc->txd.flags = flags; + desc->len = total_len; + if (result) + desc->result = result; + pq->ctl_f.fence = !!(flags & DMA_PREP_FENCE); + + /* with cb3.3 we should be able to do completion w/o a null desc */ + pq->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT); + pq->ctl_f.compl_write = 1; + + dump_pq16_desc_dbg(ioat, desc); + + /* we leave the channel locked to ensure in order submission */ + return &desc->txd; +} + static struct dma_async_tx_descriptor * ioat3_prep_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src, unsigned int src_cnt, const unsigned char *scf, size_t len, unsigned long flags) { + struct dma_device *dma = chan->device; + /* specify valid address for disabled result */ if (flags & DMA_PREP_PQ_DISABLE_P) dst[0] = dst[1]; @@ -755,11 +1297,20 @@ ioat3_prep_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src, single_source_coef[0] = scf[0]; single_source_coef[1] = 0; - return __ioat3_prep_pq_lock(chan, NULL, dst, single_source, 2, - single_source_coef, len, flags); - } else - return __ioat3_prep_pq_lock(chan, NULL, dst, src, src_cnt, scf, - len, flags); + return (src_cnt > 8) && (dma->max_pq > 8) ? + __ioat3_prep_pq16_lock(chan, NULL, dst, single_source, + 2, single_source_coef, len, + flags) : + __ioat3_prep_pq_lock(chan, NULL, dst, single_source, 2, + single_source_coef, len, flags); + + } else { + return (src_cnt > 8) && (dma->max_pq > 8) ? + __ioat3_prep_pq16_lock(chan, NULL, dst, src, src_cnt, + scf, len, flags) : + __ioat3_prep_pq_lock(chan, NULL, dst, src, src_cnt, + scf, len, flags); + } } struct dma_async_tx_descriptor * @@ -767,6 +1318,8 @@ ioat3_prep_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src, unsigned int src_cnt, const unsigned char *scf, size_t len, enum sum_check_flags *pqres, unsigned long flags) { + struct dma_device *dma = chan->device; + /* specify valid address for disabled result */ if (flags & DMA_PREP_PQ_DISABLE_P) pq[0] = pq[1]; @@ -778,14 +1331,18 @@ ioat3_prep_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src, */ *pqres = 0; - return __ioat3_prep_pq_lock(chan, pqres, pq, src, src_cnt, scf, len, - flags); + return (src_cnt > 8) && (dma->max_pq > 8) ? + __ioat3_prep_pq16_lock(chan, pqres, pq, src, src_cnt, scf, len, + flags) : + __ioat3_prep_pq_lock(chan, pqres, pq, src, src_cnt, scf, len, + flags); } static struct dma_async_tx_descriptor * ioat3_prep_pqxor(struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src, unsigned int src_cnt, size_t len, unsigned long flags) { + struct dma_device *dma = chan->device; unsigned char scf[src_cnt]; dma_addr_t pq[2]; @@ -794,8 +1351,11 @@ ioat3_prep_pqxor(struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src, flags |= DMA_PREP_PQ_DISABLE_Q; pq[1] = dst; /* specify valid address for disabled result */ - return __ioat3_prep_pq_lock(chan, NULL, pq, src, src_cnt, scf, len, - flags); + return (src_cnt > 8) && (dma->max_pq > 8) ? + __ioat3_prep_pq16_lock(chan, NULL, pq, src, src_cnt, scf, len, + flags) : + __ioat3_prep_pq_lock(chan, NULL, pq, src, src_cnt, scf, len, + flags); } struct dma_async_tx_descriptor * @@ -803,6 +1363,7 @@ ioat3_prep_pqxor_val(struct dma_chan *chan, dma_addr_t *src, unsigned int src_cnt, size_t len, enum sum_check_flags *result, unsigned long flags) { + struct dma_device *dma = chan->device; unsigned char scf[src_cnt]; dma_addr_t pq[2]; @@ -816,8 +1377,12 @@ ioat3_prep_pqxor_val(struct dma_chan *chan, dma_addr_t *src, flags |= DMA_PREP_PQ_DISABLE_Q; pq[1] = pq[0]; /* specify valid address for disabled result */ - return __ioat3_prep_pq_lock(chan, result, pq, &src[1], src_cnt - 1, scf, - len, flags); + + return (src_cnt > 8) && (dma->max_pq > 8) ? + __ioat3_prep_pq16_lock(chan, result, pq, &src[1], src_cnt - 1, + scf, len, flags) : + __ioat3_prep_pq_lock(chan, result, pq, &src[1], src_cnt - 1, + scf, len, flags); } static struct dma_async_tx_descriptor * @@ -1167,6 +1732,56 @@ static int ioat3_dma_self_test(struct ioatdma_device *device) return 0; } +static int ioat3_irq_reinit(struct ioatdma_device *device) +{ + int msixcnt = device->common.chancnt; + struct pci_dev *pdev = device->pdev; + int i; + struct msix_entry *msix; + struct ioat_chan_common *chan; + int err = 0; + + switch (device->irq_mode) { + case IOAT_MSIX: + + for (i = 0; i < msixcnt; i++) { + msix = &device->msix_entries[i]; + chan = ioat_chan_by_index(device, i); + devm_free_irq(&pdev->dev, msix->vector, chan); + } + + pci_disable_msix(pdev); + break; + + case IOAT_MSIX_SINGLE: + msix = &device->msix_entries[0]; + chan = ioat_chan_by_index(device, 0); + devm_free_irq(&pdev->dev, msix->vector, chan); + pci_disable_msix(pdev); + break; + + case IOAT_MSI: + chan = ioat_chan_by_index(device, 0); + devm_free_irq(&pdev->dev, pdev->irq, chan); + pci_disable_msi(pdev); + break; + + case IOAT_INTX: + chan = ioat_chan_by_index(device, 0); + devm_free_irq(&pdev->dev, pdev->irq, chan); + break; + + default: + return 0; + } + + device->irq_mode = IOAT_NOIRQ; + + err = ioat_dma_setup_interrupts(device); + + return err; +} + static int ioat3_reset_hw(struct ioat_chan_common *chan) { /* throw away whatever the channel was doing and get it @@ -1183,80 +1798,65 @@ static int ioat3_reset_hw(struct ioat_chan_common *chan) chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET); writel(chanerr, chan->reg_base + IOAT_CHANERR_OFFSET); - /* clear any pending errors */ - err = pci_read_config_dword(pdev, IOAT_PCI_CHANERR_INT_OFFSET, &chanerr); + if (device->version < IOAT_VER_3_3) { + /* clear any pending errors */ + err = pci_read_config_dword(pdev, + IOAT_PCI_CHANERR_INT_OFFSET, &chanerr); + if (err) { + dev_err(&pdev->dev, + "channel error register unreachable\n"); + return err; + } + pci_write_config_dword(pdev, + IOAT_PCI_CHANERR_INT_OFFSET, chanerr); + + /* Clear DMAUNCERRSTS Cfg-Reg Parity Error status bit + * (workaround for spurious config parity error after restart) + */ + pci_read_config_word(pdev, IOAT_PCI_DEVICE_ID_OFFSET, &dev_id); + if (dev_id == PCI_DEVICE_ID_INTEL_IOAT_TBG0) { + pci_write_config_dword(pdev, + IOAT_PCI_DMAUNCERRSTS_OFFSET, + 0x10); + } + } + + err = ioat2_reset_sync(chan, msecs_to_jiffies(200)); if (err) { - dev_err(&pdev->dev, "channel error register unreachable\n"); + dev_err(&pdev->dev, "Failed to reset!\n"); return err; } - pci_write_config_dword(pdev, IOAT_PCI_CHANERR_INT_OFFSET, chanerr); - /* Clear DMAUNCERRSTS Cfg-Reg Parity Error status bit - * (workaround for spurious config parity error after restart) - */ - pci_read_config_word(pdev, IOAT_PCI_DEVICE_ID_OFFSET, &dev_id); - if (dev_id == PCI_DEVICE_ID_INTEL_IOAT_TBG0) - pci_write_config_dword(pdev, IOAT_PCI_DMAUNCERRSTS_OFFSET, 0x10); + if (device->irq_mode != IOAT_NOIRQ && is_bwd_ioat(pdev)) + err = ioat3_irq_reinit(device); - return ioat2_reset_sync(chan, msecs_to_jiffies(200)); + return err; } -static bool is_jf_ioat(struct pci_dev *pdev) +static void ioat3_intr_quirk(struct ioatdma_device *device) { - switch (pdev->device) { - case PCI_DEVICE_ID_INTEL_IOAT_JSF0: - case PCI_DEVICE_ID_INTEL_IOAT_JSF1: - case PCI_DEVICE_ID_INTEL_IOAT_JSF2: - case PCI_DEVICE_ID_INTEL_IOAT_JSF3: - case PCI_DEVICE_ID_INTEL_IOAT_JSF4: - case PCI_DEVICE_ID_INTEL_IOAT_JSF5: - case PCI_DEVICE_ID_INTEL_IOAT_JSF6: - case PCI_DEVICE_ID_INTEL_IOAT_JSF7: - case PCI_DEVICE_ID_INTEL_IOAT_JSF8: - case PCI_DEVICE_ID_INTEL_IOAT_JSF9: - return true; - default: - return false; - } -} + struct dma_device *dma; + struct dma_chan *c; + struct ioat_chan_common *chan; + u32 errmask; -static bool is_snb_ioat(struct pci_dev *pdev) -{ - switch (pdev->device) { - case PCI_DEVICE_ID_INTEL_IOAT_SNB0: - case PCI_DEVICE_ID_INTEL_IOAT_SNB1: - case PCI_DEVICE_ID_INTEL_IOAT_SNB2: - case PCI_DEVICE_ID_INTEL_IOAT_SNB3: - case PCI_DEVICE_ID_INTEL_IOAT_SNB4: - case PCI_DEVICE_ID_INTEL_IOAT_SNB5: - case PCI_DEVICE_ID_INTEL_IOAT_SNB6: - case PCI_DEVICE_ID_INTEL_IOAT_SNB7: - case PCI_DEVICE_ID_INTEL_IOAT_SNB8: - case PCI_DEVICE_ID_INTEL_IOAT_SNB9: - return true; - default: - return false; - } -} + dma = &device->common; -static bool is_ivb_ioat(struct pci_dev *pdev) -{ - switch (pdev->device) { - case PCI_DEVICE_ID_INTEL_IOAT_IVB0: - case PCI_DEVICE_ID_INTEL_IOAT_IVB1: - case PCI_DEVICE_ID_INTEL_IOAT_IVB2: - case PCI_DEVICE_ID_INTEL_IOAT_IVB3: - case PCI_DEVICE_ID_INTEL_IOAT_IVB4: - case PCI_DEVICE_ID_INTEL_IOAT_IVB5: - case PCI_DEVICE_ID_INTEL_IOAT_IVB6: - case PCI_DEVICE_ID_INTEL_IOAT_IVB7: - case PCI_DEVICE_ID_INTEL_IOAT_IVB8: - case PCI_DEVICE_ID_INTEL_IOAT_IVB9: - return true; - default: - return false; + /* + * if we have descriptor write back error status, we mask the + * error interrupts + */ + if (device->cap & IOAT_CAP_DWBES) { + list_for_each_entry(c, &dma->channels, device_node) { + chan = to_chan_common(c); + errmask = readl(chan->reg_base + + IOAT_CHANERR_MASK_OFFSET); + errmask |= IOAT_CHANERR_XOR_P_OR_CRC_ERR | + IOAT_CHANERR_XOR_Q_ERR; + writel(errmask, chan->reg_base + + IOAT_CHANERR_MASK_OFFSET); + } } - } int ioat3_dma_probe(struct ioatdma_device *device, int dca) @@ -1268,30 +1868,33 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca) struct ioat_chan_common *chan; bool is_raid_device = false; int err; - u32 cap; device->enumerate_channels = ioat2_enumerate_channels; device->reset_hw = ioat3_reset_hw; device->self_test = ioat3_dma_self_test; + device->intr_quirk = ioat3_intr_quirk; dma = &device->common; dma->device_prep_dma_memcpy = ioat2_dma_prep_memcpy_lock; dma->device_issue_pending = ioat2_issue_pending; dma->device_alloc_chan_resources = ioat2_alloc_chan_resources; dma->device_free_chan_resources = ioat2_free_chan_resources; - if (is_jf_ioat(pdev) || is_snb_ioat(pdev) || is_ivb_ioat(pdev)) + if (is_xeon_cb32(pdev)) dma->copy_align = 6; dma_cap_set(DMA_INTERRUPT, dma->cap_mask); dma->device_prep_dma_interrupt = ioat3_prep_interrupt_lock; - cap = readl(device->reg_base + IOAT_DMA_CAP_OFFSET); + device->cap = readl(device->reg_base + IOAT_DMA_CAP_OFFSET); + + if (is_bwd_noraid(pdev)) + device->cap &= ~(IOAT_CAP_XOR | IOAT_CAP_PQ | IOAT_CAP_RAID16SS); /* dca is incompatible with raid operations */ - if (dca_en && (cap & (IOAT_CAP_XOR|IOAT_CAP_PQ))) - cap &= ~(IOAT_CAP_XOR|IOAT_CAP_PQ); + if (dca_en && (device->cap & (IOAT_CAP_XOR|IOAT_CAP_PQ))) + device->cap &= ~(IOAT_CAP_XOR|IOAT_CAP_PQ); - if (cap & IOAT_CAP_XOR) { + if (device->cap & IOAT_CAP_XOR) { is_raid_device = true; dma->max_xor = 8; dma->xor_align = 6; @@ -1302,53 +1905,86 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca) dma_cap_set(DMA_XOR_VAL, dma->cap_mask); dma->device_prep_dma_xor_val = ioat3_prep_xor_val; } - if (cap & IOAT_CAP_PQ) { + + if (device->cap & IOAT_CAP_PQ) { is_raid_device = true; - dma_set_maxpq(dma, 8, 0); - dma->pq_align = 6; - dma_cap_set(DMA_PQ, dma->cap_mask); dma->device_prep_dma_pq = ioat3_prep_pq; - - dma_cap_set(DMA_PQ_VAL, dma->cap_mask); dma->device_prep_dma_pq_val = ioat3_prep_pq_val; + dma_cap_set(DMA_PQ, dma->cap_mask); + dma_cap_set(DMA_PQ_VAL, dma->cap_mask); - if (!(cap & IOAT_CAP_XOR)) { - dma->max_xor = 8; - dma->xor_align = 6; + if (device->cap & IOAT_CAP_RAID16SS) { + dma_set_maxpq(dma, 16, 0); + dma->pq_align = 0; + } else { + dma_set_maxpq(dma, 8, 0); + if (is_xeon_cb32(pdev)) + dma->pq_align = 6; + else + dma->pq_align = 0; + } - dma_cap_set(DMA_XOR, dma->cap_mask); + if (!(device->cap & IOAT_CAP_XOR)) { dma->device_prep_dma_xor = ioat3_prep_pqxor; - - dma_cap_set(DMA_XOR_VAL, dma->cap_mask); dma->device_prep_dma_xor_val = ioat3_prep_pqxor_val; + dma_cap_set(DMA_XOR, dma->cap_mask); + dma_cap_set(DMA_XOR_VAL, dma->cap_mask); + + if (device->cap & IOAT_CAP_RAID16SS) { + dma->max_xor = 16; + dma->xor_align = 0; + } else { + dma->max_xor = 8; + if (is_xeon_cb32(pdev)) + dma->xor_align = 6; + else + dma->xor_align = 0; + } } } - if (is_raid_device && (cap & IOAT_CAP_FILL_BLOCK)) { + + if (is_raid_device && (device->cap & IOAT_CAP_FILL_BLOCK)) { dma_cap_set(DMA_MEMSET, dma->cap_mask); dma->device_prep_dma_memset = ioat3_prep_memset_lock; } - if (is_raid_device) { - dma->device_tx_status = ioat3_tx_status; - device->cleanup_fn = ioat3_cleanup_event; - device->timer_fn = ioat3_timer_event; - } else { - dma->device_tx_status = ioat_dma_tx_status; - device->cleanup_fn = ioat2_cleanup_event; - device->timer_fn = ioat2_timer_event; + dma->device_tx_status = ioat3_tx_status; + device->cleanup_fn = ioat3_cleanup_event; + device->timer_fn = ioat3_timer_event; + + if (is_xeon_cb32(pdev)) { + dma_cap_clear(DMA_XOR_VAL, dma->cap_mask); + dma->device_prep_dma_xor_val = NULL; + + dma_cap_clear(DMA_PQ_VAL, dma->cap_mask); + dma->device_prep_dma_pq_val = NULL; } - #ifdef CONFIG_ASYNC_TX_DISABLE_PQ_VAL_DMA - dma_cap_clear(DMA_PQ_VAL, dma->cap_mask); - dma->device_prep_dma_pq_val = NULL; - #endif + /* starting with CB3.3 super extended descriptors are supported */ + if (device->cap & IOAT_CAP_RAID16SS) { + char pool_name[14]; + int i; + + /* allocate sw descriptor pool for SED */ + device->sed_pool = kmem_cache_create("ioat_sed", + sizeof(struct ioat_sed_ent), 0, 0, NULL); + if (!device->sed_pool) + return -ENOMEM; + + for (i = 0; i < MAX_SED_POOLS; i++) { + snprintf(pool_name, 14, "ioat_hw%d_sed", i); - #ifdef CONFIG_ASYNC_TX_DISABLE_XOR_VAL_DMA - dma_cap_clear(DMA_XOR_VAL, dma->cap_mask); - dma->device_prep_dma_xor_val = NULL; - #endif + /* allocate SED DMA pool */ + device->sed_hw_pool[i] = dma_pool_create(pool_name, + &pdev->dev, + SED_SIZE * (i + 1), 64, 0); + if (!device->sed_hw_pool[i]) + goto sed_pool_cleanup; + + } + } err = ioat_probe(device); if (err) @@ -1371,4 +2007,28 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca) device->dca = ioat3_dca_init(pdev, device->reg_base); return 0; + +sed_pool_cleanup: + if (device->sed_pool) { + int i; + kmem_cache_destroy(device->sed_pool); + + for (i = 0; i < MAX_SED_POOLS; i++) + if (device->sed_hw_pool[i]) + dma_pool_destroy(device->sed_hw_pool[i]); + } + + return -ENOMEM; +} + +void ioat3_dma_remove(struct ioatdma_device *device) +{ + if (device->sed_pool) { + int i; + kmem_cache_destroy(device->sed_pool); + + for (i = 0; i < MAX_SED_POOLS; i++) + if (device->sed_hw_pool[i]) + dma_pool_destroy(device->sed_hw_pool[i]); + } } diff --git a/drivers/dma/ioat/hw.h b/drivers/dma/ioat/hw.h index 7cb74c62c719..5ee57d402a6e 100644 --- a/drivers/dma/ioat/hw.h +++ b/drivers/dma/ioat/hw.h @@ -30,11 +30,6 @@ #define IOAT_PCI_DID_SCNB 0x65FF #define IOAT_PCI_DID_SNB 0x402F -#define IOAT_VER_1_2 0x12 /* Version 1.2 */ -#define IOAT_VER_2_0 0x20 /* Version 2.0 */ -#define IOAT_VER_3_0 0x30 /* Version 3.0 */ -#define IOAT_VER_3_2 0x32 /* Version 3.2 */ - #define PCI_DEVICE_ID_INTEL_IOAT_IVB0 0x0e20 #define PCI_DEVICE_ID_INTEL_IOAT_IVB1 0x0e21 #define PCI_DEVICE_ID_INTEL_IOAT_IVB2 0x0e22 @@ -46,6 +41,29 @@ #define PCI_DEVICE_ID_INTEL_IOAT_IVB8 0x0e2e #define PCI_DEVICE_ID_INTEL_IOAT_IVB9 0x0e2f +#define PCI_DEVICE_ID_INTEL_IOAT_HSW0 0x2f20 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW1 0x2f21 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW2 0x2f22 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW3 0x2f23 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW4 0x2f24 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW5 0x2f25 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW6 0x2f26 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW7 0x2f27 +#define PCI_DEVICE_ID_INTEL_IOAT_HSW8 0x2f2e +#define PCI_DEVICE_ID_INTEL_IOAT_HSW9 0x2f2f + +#define PCI_DEVICE_ID_INTEL_IOAT_BWD0 0x0C50 +#define PCI_DEVICE_ID_INTEL_IOAT_BWD1 0x0C51 +#define PCI_DEVICE_ID_INTEL_IOAT_BWD2 0x0C52 +#define PCI_DEVICE_ID_INTEL_IOAT_BWD3 0x0C53 + +#define IOAT_VER_1_2 0x12 /* Version 1.2 */ +#define IOAT_VER_2_0 0x20 /* Version 2.0 */ +#define IOAT_VER_3_0 0x30 /* Version 3.0 */ +#define IOAT_VER_3_2 0x32 /* Version 3.2 */ +#define IOAT_VER_3_3 0x33 /* Version 3.3 */ + + int system_has_dca_enabled(struct pci_dev *pdev); struct ioat_dma_descriptor { @@ -147,7 +165,17 @@ struct ioat_xor_ext_descriptor { }; struct ioat_pq_descriptor { - uint32_t size; + union { + uint32_t size; + uint32_t dwbes; + struct { + unsigned int rsvd:25; + unsigned int p_val_err:1; + unsigned int q_val_err:1; + unsigned int rsvd1:4; + unsigned int wbes:1; + } dwbes_f; + }; union { uint32_t ctl; struct { @@ -162,9 +190,14 @@ struct ioat_pq_descriptor { unsigned int hint:1; unsigned int p_disable:1; unsigned int q_disable:1; - unsigned int rsvd:11; + unsigned int rsvd2:2; + unsigned int wb_en:1; + unsigned int prl_en:1; + unsigned int rsvd3:7; #define IOAT_OP_PQ 0x89 #define IOAT_OP_PQ_VAL 0x8a + #define IOAT_OP_PQ_16S 0xa0 + #define IOAT_OP_PQ_VAL_16S 0xa1 unsigned int op:8; } ctl_f; }; @@ -172,7 +205,10 @@ struct ioat_pq_descriptor { uint64_t p_addr; uint64_t next; uint64_t src_addr2; - uint64_t src_addr3; + union { + uint64_t src_addr3; + uint64_t sed_addr; + }; uint8_t coef[8]; uint64_t q_addr; }; @@ -221,4 +257,40 @@ struct ioat_pq_update_descriptor { struct ioat_raw_descriptor { uint64_t field[8]; }; + +struct ioat_pq16a_descriptor { + uint8_t coef[8]; + uint64_t src_addr3; + uint64_t src_addr4; + uint64_t src_addr5; + uint64_t src_addr6; + uint64_t src_addr7; + uint64_t src_addr8; + uint64_t src_addr9; +}; + +struct ioat_pq16b_descriptor { + uint64_t src_addr10; + uint64_t src_addr11; + uint64_t src_addr12; + uint64_t src_addr13; + uint64_t src_addr14; + uint64_t src_addr15; + uint64_t src_addr16; + uint64_t rsvd; +}; + +union ioat_sed_pq_descriptor { + struct ioat_pq16a_descriptor a; + struct ioat_pq16b_descriptor b; +}; + +#define SED_SIZE 64 + +struct ioat_sed_raw_descriptor { + uint64_t a[8]; + uint64_t b[8]; + uint64_t c[8]; +}; + #endif diff --git a/drivers/dma/ioat/pci.c b/drivers/dma/ioat/pci.c index 71c7ecd80fac..2c8d560e6334 100644 --- a/drivers/dma/ioat/pci.c +++ b/drivers/dma/ioat/pci.c @@ -94,6 +94,23 @@ static struct pci_device_id ioat_pci_tbl[] = { { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB8) }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB9) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW0) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW1) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW2) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW3) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW4) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW5) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW6) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW7) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW8) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_HSW9) }, + + /* I/OAT v3.3 platforms */ + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_BWD0) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_BWD1) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_BWD2) }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_BWD3) }, + { 0, } }; MODULE_DEVICE_TABLE(pci, ioat_pci_tbl); @@ -190,6 +207,9 @@ static void ioat_remove(struct pci_dev *pdev) if (!device) return; + if (device->version >= IOAT_VER_3_0) + ioat3_dma_remove(device); + dev_err(&pdev->dev, "Removing dma and dca services\n"); if (device->dca) { unregister_dca_provider(device->dca, &pdev->dev); diff --git a/drivers/dma/ioat/registers.h b/drivers/dma/ioat/registers.h index 1391798542b6..2f1cfa0f1f47 100644 --- a/drivers/dma/ioat/registers.h +++ b/drivers/dma/ioat/registers.h @@ -79,6 +79,8 @@ #define IOAT_CAP_APIC 0x00000080 #define IOAT_CAP_XOR 0x00000100 #define IOAT_CAP_PQ 0x00000200 +#define IOAT_CAP_DWBES 0x00002000 +#define IOAT_CAP_RAID16SS 0x00020000 #define IOAT_CHANNEL_MMIO_SIZE 0x80 /* Each Channel MMIO space is this size */ @@ -93,6 +95,8 @@ #define IOAT_CHANCTRL_ERR_COMPLETION_EN 0x0004 #define IOAT_CHANCTRL_INT_REARM 0x0001 #define IOAT_CHANCTRL_RUN (IOAT_CHANCTRL_INT_REARM |\ + IOAT_CHANCTRL_ERR_INT_EN |\ + IOAT_CHANCTRL_ERR_COMPLETION_EN |\ IOAT_CHANCTRL_ANY_ERR_ABORT_EN) #define IOAT_DMA_COMP_OFFSET 0x02 /* 16-bit DMA channel compatibility */ diff --git a/drivers/dma/ipu/ipu_idmac.c b/drivers/dma/ipu/ipu_idmac.c index 8c61d17a86bf..d39c2cd0795d 100644 --- a/drivers/dma/ipu/ipu_idmac.c +++ b/drivers/dma/ipu/ipu_idmac.c @@ -1642,7 +1642,7 @@ static int __init ipu_idmac_init(struct ipu *ipu) return dma_async_device_register(&idmac->dma); } -static void __exit ipu_idmac_exit(struct ipu *ipu) +static void ipu_idmac_exit(struct ipu *ipu) { int i; struct idmac *idmac = &ipu->idmac; @@ -1756,7 +1756,7 @@ err_noirq: return ret; } -static int __exit ipu_remove(struct platform_device *pdev) +static int ipu_remove(struct platform_device *pdev) { struct ipu *ipu = platform_get_drvdata(pdev); @@ -1781,7 +1781,7 @@ static struct platform_driver ipu_platform_driver = { .name = "ipu-core", .owner = THIS_MODULE, }, - .remove = __exit_p(ipu_remove), + .remove = ipu_remove, }; static int __init ipu_init(void) diff --git a/drivers/dma/of-dma.c b/drivers/dma/of-dma.c index 69d04d28b1ef..7aa0864cd487 100644 --- a/drivers/dma/of-dma.c +++ b/drivers/dma/of-dma.c @@ -13,43 +13,31 @@ #include #include #include -#include +#include #include #include #include static LIST_HEAD(of_dma_list); -static DEFINE_SPINLOCK(of_dma_lock); +static DEFINE_MUTEX(of_dma_lock); /** - * of_dma_get_controller - Get a DMA controller in DT DMA helpers list + * of_dma_find_controller - Get a DMA controller in DT DMA helpers list * @dma_spec: pointer to DMA specifier as found in the device tree * * Finds a DMA controller with matching device node and number for dma cells - * in a list of registered DMA controllers. If a match is found the use_count - * variable is increased and a valid pointer to the DMA data stored is retuned. - * A NULL pointer is returned if no match is found. + * in a list of registered DMA controllers. If a match is found a valid pointer + * to the DMA data stored is retuned. A NULL pointer is returned if no match is + * found. */ -static struct of_dma *of_dma_get_controller(struct of_phandle_args *dma_spec) +static struct of_dma *of_dma_find_controller(struct of_phandle_args *dma_spec) { struct of_dma *ofdma; - spin_lock(&of_dma_lock); - - if (list_empty(&of_dma_list)) { - spin_unlock(&of_dma_lock); - return NULL; - } - list_for_each_entry(ofdma, &of_dma_list, of_dma_controllers) if ((ofdma->of_node == dma_spec->np) && - (ofdma->of_dma_nbcells == dma_spec->args_count)) { - ofdma->use_count++; - spin_unlock(&of_dma_lock); + (ofdma->of_dma_nbcells == dma_spec->args_count)) return ofdma; - } - - spin_unlock(&of_dma_lock); pr_debug("%s: can't find DMA controller %s\n", __func__, dma_spec->np->full_name); @@ -57,22 +45,6 @@ static struct of_dma *of_dma_get_controller(struct of_phandle_args *dma_spec) return NULL; } -/** - * of_dma_put_controller - Decrement use count for a registered DMA controller - * @of_dma: pointer to DMA controller data - * - * Decrements the use_count variable in the DMA data structure. This function - * should be called only when a valid pointer is returned from - * of_dma_get_controller() and no further accesses to data referenced by that - * pointer are needed. - */ -static void of_dma_put_controller(struct of_dma *ofdma) -{ - spin_lock(&of_dma_lock); - ofdma->use_count--; - spin_unlock(&of_dma_lock); -} - /** * of_dma_controller_register - Register a DMA controller to DT DMA helpers * @np: device node of DMA controller @@ -93,6 +65,7 @@ int of_dma_controller_register(struct device_node *np, { struct of_dma *ofdma; int nbcells; + const __be32 *prop; if (!np || !of_dma_xlate) { pr_err("%s: not enough information provided\n", __func__); @@ -103,8 +76,11 @@ int of_dma_controller_register(struct device_node *np, if (!ofdma) return -ENOMEM; - nbcells = be32_to_cpup(of_get_property(np, "#dma-cells", NULL)); - if (!nbcells) { + prop = of_get_property(np, "#dma-cells", NULL); + if (prop) + nbcells = be32_to_cpup(prop); + + if (!prop || !nbcells) { pr_err("%s: #dma-cells property is missing or invalid\n", __func__); kfree(ofdma); @@ -115,12 +91,11 @@ int of_dma_controller_register(struct device_node *np, ofdma->of_dma_nbcells = nbcells; ofdma->of_dma_xlate = of_dma_xlate; ofdma->of_dma_data = data; - ofdma->use_count = 0; /* Now queue of_dma controller structure in list */ - spin_lock(&of_dma_lock); + mutex_lock(&of_dma_lock); list_add_tail(&ofdma->of_dma_controllers, &of_dma_list); - spin_unlock(&of_dma_lock); + mutex_unlock(&of_dma_lock); return 0; } @@ -132,32 +107,20 @@ EXPORT_SYMBOL_GPL(of_dma_controller_register); * * Memory allocated by of_dma_controller_register() is freed here. */ -int of_dma_controller_free(struct device_node *np) +void of_dma_controller_free(struct device_node *np) { struct of_dma *ofdma; - spin_lock(&of_dma_lock); - - if (list_empty(&of_dma_list)) { - spin_unlock(&of_dma_lock); - return -ENODEV; - } + mutex_lock(&of_dma_lock); list_for_each_entry(ofdma, &of_dma_list, of_dma_controllers) if (ofdma->of_node == np) { - if (ofdma->use_count) { - spin_unlock(&of_dma_lock); - return -EBUSY; - } - list_del(&ofdma->of_dma_controllers); - spin_unlock(&of_dma_lock); kfree(ofdma); - return 0; + break; } - spin_unlock(&of_dma_lock); - return -ENODEV; + mutex_unlock(&of_dma_lock); } EXPORT_SYMBOL_GPL(of_dma_controller_free); @@ -172,8 +135,8 @@ EXPORT_SYMBOL_GPL(of_dma_controller_free); * specifiers, matches the name provided. Returns 0 if the name matches and * a valid pointer to the DMA specifier is found. Otherwise returns -ENODEV. */ -static int of_dma_match_channel(struct device_node *np, char *name, int index, - struct of_phandle_args *dma_spec) +static int of_dma_match_channel(struct device_node *np, const char *name, + int index, struct of_phandle_args *dma_spec) { const char *s; @@ -198,7 +161,7 @@ static int of_dma_match_channel(struct device_node *np, char *name, int index, * Returns pointer to appropriate dma channel on success or NULL on error. */ struct dma_chan *of_dma_request_slave_channel(struct device_node *np, - char *name) + const char *name) { struct of_phandle_args dma_spec; struct of_dma *ofdma; @@ -220,14 +183,15 @@ struct dma_chan *of_dma_request_slave_channel(struct device_node *np, if (of_dma_match_channel(np, name, i, &dma_spec)) continue; - ofdma = of_dma_get_controller(&dma_spec); - - if (!ofdma) - continue; + mutex_lock(&of_dma_lock); + ofdma = of_dma_find_controller(&dma_spec); - chan = ofdma->of_dma_xlate(&dma_spec, ofdma); + if (ofdma) + chan = ofdma->of_dma_xlate(&dma_spec, ofdma); + else + chan = NULL; - of_dma_put_controller(ofdma); + mutex_unlock(&of_dma_lock); of_node_put(dma_spec.np); diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c index 08b43bf37158..ec3fc4fd9160 100644 --- a/drivers/dma/omap-dma.c +++ b/drivers/dma/omap-dma.c @@ -16,6 +16,8 @@ #include #include #include +#include +#include #include "virt-dma.h" @@ -67,6 +69,10 @@ static const unsigned es_bytes[] = { [OMAP_DMA_DATA_TYPE_S32] = 4, }; +static struct of_dma_filter_info omap_dma_info = { + .filter_fn = omap_dma_filter_fn, +}; + static inline struct omap_dmadev *to_omap_dma_dev(struct dma_device *d) { return container_of(d, struct omap_dmadev, ddev); @@ -629,8 +635,22 @@ static int omap_dma_probe(struct platform_device *pdev) pr_warn("OMAP-DMA: failed to register slave DMA engine device: %d\n", rc); omap_dma_free(od); - } else { - platform_set_drvdata(pdev, od); + return rc; + } + + platform_set_drvdata(pdev, od); + + if (pdev->dev.of_node) { + omap_dma_info.dma_cap = od->ddev.cap_mask; + + /* Device-tree DMA controller registration */ + rc = of_dma_controller_register(pdev->dev.of_node, + of_dma_simple_xlate, &omap_dma_info); + if (rc) { + pr_warn("OMAP-DMA: failed to register DMA controller\n"); + dma_async_device_unregister(&od->ddev); + omap_dma_free(od); + } } dev_info(&pdev->dev, "OMAP DMA engine driver\n"); @@ -642,18 +662,32 @@ static int omap_dma_remove(struct platform_device *pdev) { struct omap_dmadev *od = platform_get_drvdata(pdev); + if (pdev->dev.of_node) + of_dma_controller_free(pdev->dev.of_node); + dma_async_device_unregister(&od->ddev); omap_dma_free(od); return 0; } +static const struct of_device_id omap_dma_match[] = { + { .compatible = "ti,omap2420-sdma", }, + { .compatible = "ti,omap2430-sdma", }, + { .compatible = "ti,omap3430-sdma", }, + { .compatible = "ti,omap3630-sdma", }, + { .compatible = "ti,omap4430-sdma", }, + {}, +}; +MODULE_DEVICE_TABLE(of, omap_dma_match); + static struct platform_driver omap_dma_driver = { .probe = omap_dma_probe, .remove = omap_dma_remove, .driver = { .name = "omap-dma-engine", .owner = THIS_MODULE, + .of_match_table = of_match_ptr(omap_dma_match), }, }; diff --git a/drivers/dma/pch_dma.c b/drivers/dma/pch_dma.c index d01faeb0f27c..ce3dc3e9688c 100644 --- a/drivers/dma/pch_dma.c +++ b/drivers/dma/pch_dma.c @@ -476,7 +476,7 @@ static struct pch_dma_desc *pdc_desc_get(struct pch_dma_chan *pd_chan) dev_dbg(chan2dev(&pd_chan->chan), "scanned %d descriptors\n", i); if (!ret) { - ret = pdc_alloc_desc(&pd_chan->chan, GFP_NOIO); + ret = pdc_alloc_desc(&pd_chan->chan, GFP_ATOMIC); if (ret) { spin_lock(&pd_chan->lock); pd_chan->descs_allocated++; diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 5dbc5946c4c3..a17553f7c028 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -26,6 +26,7 @@ #include #include #include +#include #include "dmaengine.h" #define PL330_MAX_CHAN 8 @@ -2288,13 +2289,12 @@ static inline void fill_queue(struct dma_pl330_chan *pch) /* If already submitted */ if (desc->status == BUSY) - break; + continue; ret = pl330_submit_req(pch->pl330_chid, &desc->req); if (!ret) { desc->status = BUSY; - break; } else if (ret == -EAGAIN) { /* QFull or DMAC Dying */ break; @@ -2904,9 +2904,9 @@ pl330_probe(struct amba_device *adev, const struct amba_id *id) pi->mcbufsz = pdat ? pdat->mcbuf_sz : 0; res = &adev->res; - pi->base = devm_request_and_ioremap(&adev->dev, res); - if (!pi->base) - return -ENXIO; + pi->base = devm_ioremap_resource(&adev->dev, res); + if (IS_ERR(pi->base)) + return PTR_ERR(pi->base); amba_set_drvdata(adev, pdmac); diff --git a/drivers/dma/sh/Kconfig b/drivers/dma/sh/Kconfig new file mode 100644 index 000000000000..5c1dee20c13e --- /dev/null +++ b/drivers/dma/sh/Kconfig @@ -0,0 +1,24 @@ +# +# DMA engine configuration for sh +# + +config SH_DMAE_BASE + bool "Renesas SuperH DMA Engine support" + depends on (SUPERH && SH_DMA) || (ARM && ARCH_SHMOBILE) + depends on !SH_DMA_API + default y + select DMA_ENGINE + help + Enable support for the Renesas SuperH DMA controllers. + +config SH_DMAE + tristate "Renesas SuperH DMAC support" + depends on SH_DMAE_BASE + help + Enable support for the Renesas SuperH DMA controllers. + +config SUDMAC + tristate "Renesas SUDMAC support" + depends on SH_DMAE_BASE + help + Enable support for the Renesas SUDMAC controllers. diff --git a/drivers/dma/sh/Makefile b/drivers/dma/sh/Makefile index 54ae9572b0ac..c07ca4612e46 100644 --- a/drivers/dma/sh/Makefile +++ b/drivers/dma/sh/Makefile @@ -1,2 +1,3 @@ -obj-$(CONFIG_SH_DMAE) += shdma-base.o +obj-$(CONFIG_SH_DMAE_BASE) += shdma-base.o obj-$(CONFIG_SH_DMAE) += shdma.o +obj-$(CONFIG_SUDMAC) += sudmac.o diff --git a/drivers/dma/sh/sudmac.c b/drivers/dma/sh/sudmac.c new file mode 100644 index 000000000000..e7c94bbddb53 --- /dev/null +++ b/drivers/dma/sh/sudmac.c @@ -0,0 +1,428 @@ +/* + * Renesas SUDMAC support + * + * Copyright (C) 2013 Renesas Solutions Corp. + * + * based on drivers/dma/sh/shdma.c: + * Copyright (C) 2011-2012 Guennadi Liakhovetski + * Copyright (C) 2009 Nobuhiro Iwamatsu + * Copyright (C) 2009 Renesas Solutions, Inc. All rights reserved. + * Copyright (C) 2007 Freescale Semiconductor, Inc. All rights reserved. + * + * This is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include +#include +#include + +struct sudmac_chan { + struct shdma_chan shdma_chan; + void __iomem *base; + char dev_id[16]; /* unique name per DMAC of channel */ + + u32 offset; /* for CFG, BA, BBC, CA, CBC, DEN */ + u32 cfg; + u32 dint_end_bit; +}; + +struct sudmac_device { + struct shdma_dev shdma_dev; + struct sudmac_pdata *pdata; + void __iomem *chan_reg; +}; + +struct sudmac_regs { + u32 base_addr; + u32 base_byte_count; +}; + +struct sudmac_desc { + struct sudmac_regs hw; + struct shdma_desc shdma_desc; +}; + +#define to_chan(schan) container_of(schan, struct sudmac_chan, shdma_chan) +#define to_desc(sdesc) container_of(sdesc, struct sudmac_desc, shdma_desc) +#define to_sdev(sc) container_of(sc->shdma_chan.dma_chan.device, \ + struct sudmac_device, shdma_dev.dma_dev) + +/* SUDMAC register */ +#define SUDMAC_CH0CFG 0x00 +#define SUDMAC_CH0BA 0x10 +#define SUDMAC_CH0BBC 0x18 +#define SUDMAC_CH0CA 0x20 +#define SUDMAC_CH0CBC 0x28 +#define SUDMAC_CH0DEN 0x30 +#define SUDMAC_DSTSCLR 0x38 +#define SUDMAC_DBUFCTRL 0x3C +#define SUDMAC_DINTCTRL 0x40 +#define SUDMAC_DINTSTS 0x44 +#define SUDMAC_DINTSTSCLR 0x48 +#define SUDMAC_CH0SHCTRL 0x50 + +/* Definitions for the sudmac_channel.config */ +#define SUDMAC_SENDBUFM 0x1000 /* b12: Transmit Buffer Mode */ +#define SUDMAC_RCVENDM 0x0100 /* b8: Receive Data Transfer End Mode */ +#define SUDMAC_LBA_WAIT 0x0030 /* b5-4: Local Bus Access Wait */ + +/* Definitions for the sudmac_channel.dint_end_bit */ +#define SUDMAC_CH1ENDE 0x0002 /* b1: Ch1 DMA Transfer End Int Enable */ +#define SUDMAC_CH0ENDE 0x0001 /* b0: Ch0 DMA Transfer End Int Enable */ + +#define SUDMAC_DRV_NAME "sudmac" + +static void sudmac_writel(struct sudmac_chan *sc, u32 data, u32 reg) +{ + iowrite32(data, sc->base + reg); +} + +static u32 sudmac_readl(struct sudmac_chan *sc, u32 reg) +{ + return ioread32(sc->base + reg); +} + +static bool sudmac_is_busy(struct sudmac_chan *sc) +{ + u32 den = sudmac_readl(sc, SUDMAC_CH0DEN + sc->offset); + + if (den) + return true; /* working */ + + return false; /* waiting */ +} + +static void sudmac_set_reg(struct sudmac_chan *sc, struct sudmac_regs *hw, + struct shdma_desc *sdesc) +{ + sudmac_writel(sc, sc->cfg, SUDMAC_CH0CFG + sc->offset); + sudmac_writel(sc, hw->base_addr, SUDMAC_CH0BA + sc->offset); + sudmac_writel(sc, hw->base_byte_count, SUDMAC_CH0BBC + sc->offset); +} + +static void sudmac_start(struct sudmac_chan *sc) +{ + u32 dintctrl = sudmac_readl(sc, SUDMAC_DINTCTRL); + + sudmac_writel(sc, dintctrl | sc->dint_end_bit, SUDMAC_DINTCTRL); + sudmac_writel(sc, 1, SUDMAC_CH0DEN + sc->offset); +} + +static void sudmac_start_xfer(struct shdma_chan *schan, + struct shdma_desc *sdesc) +{ + struct sudmac_chan *sc = to_chan(schan); + struct sudmac_desc *sd = to_desc(sdesc); + + sudmac_set_reg(sc, &sd->hw, sdesc); + sudmac_start(sc); +} + +static bool sudmac_channel_busy(struct shdma_chan *schan) +{ + struct sudmac_chan *sc = to_chan(schan); + + return sudmac_is_busy(sc); +} + +static void sudmac_setup_xfer(struct shdma_chan *schan, int slave_id) +{ +} + +static const struct sudmac_slave_config *sudmac_find_slave( + struct sudmac_chan *sc, int slave_id) +{ + struct sudmac_device *sdev = to_sdev(sc); + struct sudmac_pdata *pdata = sdev->pdata; + const struct sudmac_slave_config *cfg; + int i; + + for (i = 0, cfg = pdata->slave; i < pdata->slave_num; i++, cfg++) + if (cfg->slave_id == slave_id) + return cfg; + + return NULL; +} + +static int sudmac_set_slave(struct shdma_chan *schan, int slave_id, bool try) +{ + struct sudmac_chan *sc = to_chan(schan); + const struct sudmac_slave_config *cfg = sudmac_find_slave(sc, slave_id); + + if (!cfg) + return -ENODEV; + + return 0; +} + +static inline void sudmac_dma_halt(struct sudmac_chan *sc) +{ + u32 dintctrl = sudmac_readl(sc, SUDMAC_DINTCTRL); + + sudmac_writel(sc, 0, SUDMAC_CH0DEN + sc->offset); + sudmac_writel(sc, dintctrl & ~sc->dint_end_bit, SUDMAC_DINTCTRL); + sudmac_writel(sc, sc->dint_end_bit, SUDMAC_DINTSTSCLR); +} + +static int sudmac_desc_setup(struct shdma_chan *schan, + struct shdma_desc *sdesc, + dma_addr_t src, dma_addr_t dst, size_t *len) +{ + struct sudmac_chan *sc = to_chan(schan); + struct sudmac_desc *sd = to_desc(sdesc); + + dev_dbg(sc->shdma_chan.dev, "%s: src=%x, dst=%x, len=%d\n", + __func__, src, dst, *len); + + if (*len > schan->max_xfer_len) + *len = schan->max_xfer_len; + + if (dst) + sd->hw.base_addr = dst; + else if (src) + sd->hw.base_addr = src; + sd->hw.base_byte_count = *len; + + return 0; +} + +static void sudmac_halt(struct shdma_chan *schan) +{ + struct sudmac_chan *sc = to_chan(schan); + + sudmac_dma_halt(sc); +} + +static bool sudmac_chan_irq(struct shdma_chan *schan, int irq) +{ + struct sudmac_chan *sc = to_chan(schan); + u32 dintsts = sudmac_readl(sc, SUDMAC_DINTSTS); + + if (!(dintsts & sc->dint_end_bit)) + return false; + + /* DMA stop */ + sudmac_dma_halt(sc); + + return true; +} + +static size_t sudmac_get_partial(struct shdma_chan *schan, + struct shdma_desc *sdesc) +{ + struct sudmac_chan *sc = to_chan(schan); + struct sudmac_desc *sd = to_desc(sdesc); + u32 current_byte_count = sudmac_readl(sc, SUDMAC_CH0CBC + sc->offset); + + return sd->hw.base_byte_count - current_byte_count; +} + +static bool sudmac_desc_completed(struct shdma_chan *schan, + struct shdma_desc *sdesc) +{ + struct sudmac_chan *sc = to_chan(schan); + struct sudmac_desc *sd = to_desc(sdesc); + u32 current_addr = sudmac_readl(sc, SUDMAC_CH0CA + sc->offset); + + return sd->hw.base_addr + sd->hw.base_byte_count == current_addr; +} + +static int sudmac_chan_probe(struct sudmac_device *su_dev, int id, int irq, + unsigned long flags) +{ + struct shdma_dev *sdev = &su_dev->shdma_dev; + struct platform_device *pdev = to_platform_device(sdev->dma_dev.dev); + struct sudmac_chan *sc; + struct shdma_chan *schan; + int err; + + sc = devm_kzalloc(&pdev->dev, sizeof(struct sudmac_chan), GFP_KERNEL); + if (!sc) { + dev_err(sdev->dma_dev.dev, + "No free memory for allocating dma channels!\n"); + return -ENOMEM; + } + + schan = &sc->shdma_chan; + schan->max_xfer_len = 64 * 1024 * 1024 - 1; + + shdma_chan_probe(sdev, schan, id); + + sc->base = su_dev->chan_reg; + + /* get platform_data */ + sc->offset = su_dev->pdata->channel->offset; + if (su_dev->pdata->channel->config & SUDMAC_TX_BUFFER_MODE) + sc->cfg |= SUDMAC_SENDBUFM; + if (su_dev->pdata->channel->config & SUDMAC_RX_END_MODE) + sc->cfg |= SUDMAC_RCVENDM; + sc->cfg |= (su_dev->pdata->channel->wait << 4) & SUDMAC_LBA_WAIT; + + if (su_dev->pdata->channel->dint_end_bit & SUDMAC_DMA_BIT_CH0) + sc->dint_end_bit |= SUDMAC_CH0ENDE; + if (su_dev->pdata->channel->dint_end_bit & SUDMAC_DMA_BIT_CH1) + sc->dint_end_bit |= SUDMAC_CH1ENDE; + + /* set up channel irq */ + if (pdev->id >= 0) + snprintf(sc->dev_id, sizeof(sc->dev_id), "sudmac%d.%d", + pdev->id, id); + else + snprintf(sc->dev_id, sizeof(sc->dev_id), "sudmac%d", id); + + err = shdma_request_irq(schan, irq, flags, sc->dev_id); + if (err) { + dev_err(sdev->dma_dev.dev, + "DMA channel %d request_irq failed %d\n", id, err); + goto err_no_irq; + } + + return 0; + +err_no_irq: + /* remove from dmaengine device node */ + shdma_chan_remove(schan); + return err; +} + +static void sudmac_chan_remove(struct sudmac_device *su_dev) +{ + struct dma_device *dma_dev = &su_dev->shdma_dev.dma_dev; + struct shdma_chan *schan; + int i; + + shdma_for_each_chan(schan, &su_dev->shdma_dev, i) { + struct sudmac_chan *sc = to_chan(schan); + + BUG_ON(!schan); + + shdma_free_irq(&sc->shdma_chan); + shdma_chan_remove(schan); + } + dma_dev->chancnt = 0; +} + +static dma_addr_t sudmac_slave_addr(struct shdma_chan *schan) +{ + /* SUDMAC doesn't need the address */ + return 0; +} + +static struct shdma_desc *sudmac_embedded_desc(void *buf, int i) +{ + return &((struct sudmac_desc *)buf)[i].shdma_desc; +} + +static const struct shdma_ops sudmac_shdma_ops = { + .desc_completed = sudmac_desc_completed, + .halt_channel = sudmac_halt, + .channel_busy = sudmac_channel_busy, + .slave_addr = sudmac_slave_addr, + .desc_setup = sudmac_desc_setup, + .set_slave = sudmac_set_slave, + .setup_xfer = sudmac_setup_xfer, + .start_xfer = sudmac_start_xfer, + .embedded_desc = sudmac_embedded_desc, + .chan_irq = sudmac_chan_irq, + .get_partial = sudmac_get_partial, +}; + +static int sudmac_probe(struct platform_device *pdev) +{ + struct sudmac_pdata *pdata = pdev->dev.platform_data; + int err, i; + struct sudmac_device *su_dev; + struct dma_device *dma_dev; + struct resource *chan, *irq_res; + + /* get platform data */ + if (!pdata) + return -ENODEV; + + chan = platform_get_resource(pdev, IORESOURCE_MEM, 0); + irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0); + if (!chan || !irq_res) + return -ENODEV; + + err = -ENOMEM; + su_dev = devm_kzalloc(&pdev->dev, sizeof(struct sudmac_device), + GFP_KERNEL); + if (!su_dev) { + dev_err(&pdev->dev, "Not enough memory\n"); + return err; + } + + dma_dev = &su_dev->shdma_dev.dma_dev; + + su_dev->chan_reg = devm_request_and_ioremap(&pdev->dev, chan); + if (!su_dev->chan_reg) + return err; + + dma_cap_set(DMA_SLAVE, dma_dev->cap_mask); + + su_dev->shdma_dev.ops = &sudmac_shdma_ops; + su_dev->shdma_dev.desc_size = sizeof(struct sudmac_desc); + err = shdma_init(&pdev->dev, &su_dev->shdma_dev, pdata->channel_num); + if (err < 0) + return err; + + /* platform data */ + su_dev->pdata = pdev->dev.platform_data; + + platform_set_drvdata(pdev, su_dev); + + /* Create DMA Channel */ + for (i = 0; i < pdata->channel_num; i++) { + err = sudmac_chan_probe(su_dev, i, irq_res->start, IRQF_SHARED); + if (err) + goto chan_probe_err; + } + + err = dma_async_device_register(&su_dev->shdma_dev.dma_dev); + if (err < 0) + goto chan_probe_err; + + return err; + +chan_probe_err: + sudmac_chan_remove(su_dev); + + platform_set_drvdata(pdev, NULL); + shdma_cleanup(&su_dev->shdma_dev); + + return err; +} + +static int sudmac_remove(struct platform_device *pdev) +{ + struct sudmac_device *su_dev = platform_get_drvdata(pdev); + struct dma_device *dma_dev = &su_dev->shdma_dev.dma_dev; + + dma_async_device_unregister(dma_dev); + sudmac_chan_remove(su_dev); + shdma_cleanup(&su_dev->shdma_dev); + platform_set_drvdata(pdev, NULL); + + return 0; +} + +static struct platform_driver sudmac_driver = { + .driver = { + .owner = THIS_MODULE, + .name = SUDMAC_DRV_NAME, + }, + .probe = sudmac_probe, + .remove = sudmac_remove, +}; +module_platform_driver(sudmac_driver); + +MODULE_AUTHOR("Yoshihiro Shimoda"); +MODULE_DESCRIPTION("Renesas SUDMAC driver"); +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS("platform:" SUDMAC_DRV_NAME); diff --git a/drivers/dma/sirf-dma.c b/drivers/dma/sirf-dma.c index 1d627e2391f4..1765a0a2736d 100644 --- a/drivers/dma/sirf-dma.c +++ b/drivers/dma/sirf-dma.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "dmaengine.h" @@ -78,6 +79,7 @@ struct sirfsoc_dma { struct sirfsoc_dma_chan channels[SIRFSOC_DMA_CHANNELS]; void __iomem *base; int irq; + struct clk *clk; bool is_marco; }; @@ -639,6 +641,12 @@ static int sirfsoc_dma_probe(struct platform_device *op) return -EINVAL; } + sdma->clk = devm_clk_get(dev, NULL); + if (IS_ERR(sdma->clk)) { + dev_err(dev, "failed to get a clock.\n"); + return PTR_ERR(sdma->clk); + } + ret = of_address_to_resource(dn, 0, &res); if (ret) { dev_err(dev, "Error parsing memory region!\n"); @@ -698,6 +706,8 @@ static int sirfsoc_dma_probe(struct platform_device *op) tasklet_init(&sdma->tasklet, sirfsoc_dma_tasklet, (unsigned long)sdma); + clk_prepare_enable(sdma->clk); + /* Register DMA engine */ dev_set_drvdata(dev, sdma); ret = dma_async_device_register(dma); @@ -720,6 +730,7 @@ static int sirfsoc_dma_remove(struct platform_device *op) struct device *dev = &op->dev; struct sirfsoc_dma *sdma = dev_get_drvdata(dev); + clk_disable_unprepare(sdma->clk); dma_async_device_unregister(&sdma->dma); free_irq(sdma->irq, sdma); irq_dispose_mapping(sdma->irq); @@ -742,7 +753,18 @@ static struct platform_driver sirfsoc_dma_driver = { }, }; -module_platform_driver(sirfsoc_dma_driver); +static __init int sirfsoc_dma_init(void) +{ + return platform_driver_register(&sirfsoc_dma_driver); +} + +static void __exit sirfsoc_dma_exit(void) +{ + platform_driver_unregister(&sirfsoc_dma_driver); +} + +subsys_initcall(sirfsoc_dma_init); +module_exit(sirfsoc_dma_exit); MODULE_AUTHOR("Rongjun Ying , " "Barry Song "); diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c index fcee27eae1f6..ce193409ebd3 100644 --- a/drivers/dma/tegra20-apb-dma.c +++ b/drivers/dma/tegra20-apb-dma.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include @@ -199,6 +200,7 @@ struct tegra_dma_channel { /* Channel-slave specific configuration */ struct dma_slave_config dma_sconfig; + struct tegra_dma_channel_regs channel_reg; }; /* tegra_dma: Tegra DMA specific information */ @@ -1213,7 +1215,6 @@ static const struct tegra_dma_chip_data tegra20_dma_chip_data = { .support_channel_pause = false, }; -#if defined(CONFIG_OF) /* Tegra30 specific DMA controller information */ static const struct tegra_dma_chip_data tegra30_dma_chip_data = { .nr_channels = 32, @@ -1243,7 +1244,6 @@ static const struct of_device_id tegra_dma_of_match[] = { }, }; MODULE_DEVICE_TABLE(of, tegra_dma_of_match); -#endif static int tegra_dma_probe(struct platform_device *pdev) { @@ -1252,20 +1252,14 @@ static int tegra_dma_probe(struct platform_device *pdev) int ret; int i; const struct tegra_dma_chip_data *cdata = NULL; + const struct of_device_id *match; - if (pdev->dev.of_node) { - const struct of_device_id *match; - match = of_match_device(of_match_ptr(tegra_dma_of_match), - &pdev->dev); - if (!match) { - dev_err(&pdev->dev, "Error: No device match found\n"); - return -ENODEV; - } - cdata = match->data; - } else { - /* If no device tree then fallback to tegra20 */ - cdata = &tegra20_dma_chip_data; + match = of_match_device(tegra_dma_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "Error: No device match found\n"); + return -ENODEV; } + cdata = match->data; tdma = devm_kzalloc(&pdev->dev, sizeof(*tdma) + cdata->nr_channels * sizeof(struct tegra_dma_channel), GFP_KERNEL); @@ -1448,11 +1442,74 @@ static int tegra_dma_runtime_resume(struct device *dev) return 0; } +#ifdef CONFIG_PM_SLEEP +static int tegra_dma_pm_suspend(struct device *dev) +{ + struct tegra_dma *tdma = dev_get_drvdata(dev); + int i; + int ret; + + /* Enable clock before accessing register */ + ret = tegra_dma_runtime_resume(dev); + if (ret < 0) + return ret; + + tdma->reg_gen = tdma_read(tdma, TEGRA_APBDMA_GENERAL); + for (i = 0; i < tdma->chip_data->nr_channels; i++) { + struct tegra_dma_channel *tdc = &tdma->channels[i]; + struct tegra_dma_channel_regs *ch_reg = &tdc->channel_reg; + + ch_reg->csr = tdc_read(tdc, TEGRA_APBDMA_CHAN_CSR); + ch_reg->ahb_ptr = tdc_read(tdc, TEGRA_APBDMA_CHAN_AHBPTR); + ch_reg->apb_ptr = tdc_read(tdc, TEGRA_APBDMA_CHAN_APBPTR); + ch_reg->ahb_seq = tdc_read(tdc, TEGRA_APBDMA_CHAN_AHBSEQ); + ch_reg->apb_seq = tdc_read(tdc, TEGRA_APBDMA_CHAN_APBSEQ); + } + + /* Disable clock */ + tegra_dma_runtime_suspend(dev); + return 0; +} + +static int tegra_dma_pm_resume(struct device *dev) +{ + struct tegra_dma *tdma = dev_get_drvdata(dev); + int i; + int ret; + + /* Enable clock before accessing register */ + ret = tegra_dma_runtime_resume(dev); + if (ret < 0) + return ret; + + tdma_write(tdma, TEGRA_APBDMA_GENERAL, tdma->reg_gen); + tdma_write(tdma, TEGRA_APBDMA_CONTROL, 0); + tdma_write(tdma, TEGRA_APBDMA_IRQ_MASK_SET, 0xFFFFFFFFul); + + for (i = 0; i < tdma->chip_data->nr_channels; i++) { + struct tegra_dma_channel *tdc = &tdma->channels[i]; + struct tegra_dma_channel_regs *ch_reg = &tdc->channel_reg; + + tdc_write(tdc, TEGRA_APBDMA_CHAN_APBSEQ, ch_reg->apb_seq); + tdc_write(tdc, TEGRA_APBDMA_CHAN_APBPTR, ch_reg->apb_ptr); + tdc_write(tdc, TEGRA_APBDMA_CHAN_AHBSEQ, ch_reg->ahb_seq); + tdc_write(tdc, TEGRA_APBDMA_CHAN_AHBPTR, ch_reg->ahb_ptr); + tdc_write(tdc, TEGRA_APBDMA_CHAN_CSR, + (ch_reg->csr & ~TEGRA_APBDMA_CSR_ENB)); + } + + /* Disable clock */ + tegra_dma_runtime_suspend(dev); + return 0; +} +#endif + static const struct dev_pm_ops tegra_dma_dev_pm_ops = { #ifdef CONFIG_PM_RUNTIME .runtime_suspend = tegra_dma_runtime_suspend, .runtime_resume = tegra_dma_runtime_resume, #endif + SET_SYSTEM_SLEEP_PM_OPS(tegra_dma_pm_suspend, tegra_dma_pm_resume) }; static struct platform_driver tegra_dmac_driver = { @@ -1460,7 +1517,7 @@ static struct platform_driver tegra_dmac_driver = { .name = "tegra-apbdma", .owner = THIS_MODULE, .pm = &tegra_dma_dev_pm_ops, - .of_match_table = of_match_ptr(tegra_dma_of_match), + .of_match_table = tegra_dma_of_match, }, .probe = tegra_dma_probe, .remove = tegra_dma_remove, diff --git a/drivers/dma/timb_dma.c b/drivers/dma/timb_dma.c index 952f823901a6..26107ba6edb3 100644 --- a/drivers/dma/timb_dma.c +++ b/drivers/dma/timb_dma.c @@ -823,7 +823,7 @@ static struct platform_driver td_driver = { .owner = THIS_MODULE, }, .probe = td_probe, - .remove = __exit_p(td_remove), + .remove = td_remove, }; module_platform_driver(td_driver); diff --git a/drivers/dma/txx9dmac.c b/drivers/dma/txx9dmac.c index 913f55c76c99..a59fb4841d4c 100644 --- a/drivers/dma/txx9dmac.c +++ b/drivers/dma/txx9dmac.c @@ -1190,7 +1190,7 @@ static int __init txx9dmac_chan_probe(struct platform_device *pdev) return 0; } -static int __exit txx9dmac_chan_remove(struct platform_device *pdev) +static int txx9dmac_chan_remove(struct platform_device *pdev) { struct txx9dmac_chan *dc = platform_get_drvdata(pdev); @@ -1252,7 +1252,7 @@ static int __init txx9dmac_probe(struct platform_device *pdev) return 0; } -static int __exit txx9dmac_remove(struct platform_device *pdev) +static int txx9dmac_remove(struct platform_device *pdev) { struct txx9dmac_dev *ddev = platform_get_drvdata(pdev); @@ -1299,14 +1299,14 @@ static const struct dev_pm_ops txx9dmac_dev_pm_ops = { }; static struct platform_driver txx9dmac_chan_driver = { - .remove = __exit_p(txx9dmac_chan_remove), + .remove = txx9dmac_chan_remove, .driver = { .name = "txx9dmac-chan", }, }; static struct platform_driver txx9dmac_driver = { - .remove = __exit_p(txx9dmac_remove), + .remove = txx9dmac_remove, .shutdown = txx9dmac_shutdown, .driver = { .name = "txx9dmac", diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c index 5899a76eec3b..67610a6ebf87 100644 --- a/drivers/edac/edac_mc_sysfs.c +++ b/drivers/edac/edac_mc_sysfs.c @@ -87,7 +87,7 @@ static struct device *mci_pdev; /* * various constants for Memory Controllers */ -static const char *mem_types[] = { +static const char * const mem_types[] = { [MEM_EMPTY] = "Empty", [MEM_RESERVED] = "Reserved", [MEM_UNKNOWN] = "Unknown", @@ -107,7 +107,7 @@ static const char *mem_types[] = { [MEM_RDDR3] = "Registered-DDR3" }; -static const char *dev_types[] = { +static const char * const dev_types[] = { [DEV_UNKNOWN] = "Unknown", [DEV_X1] = "x1", [DEV_X2] = "x2", @@ -118,7 +118,7 @@ static const char *dev_types[] = { [DEV_X64] = "x64" }; -static const char *edac_caps[] = { +static const char * const edac_caps[] = { [EDAC_UNKNOWN] = "Unknown", [EDAC_NONE] = "None", [EDAC_RESERVED] = "Reserved", @@ -327,17 +327,17 @@ static struct device_attribute *dynamic_csrow_dimm_attr[] = { }; /* possible dynamic channel ce_count attribute files */ -DEVICE_CHANNEL(ch0_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch0_ce_count, S_IRUGO, channel_ce_count_show, NULL, 0); -DEVICE_CHANNEL(ch1_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch1_ce_count, S_IRUGO, channel_ce_count_show, NULL, 1); -DEVICE_CHANNEL(ch2_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch2_ce_count, S_IRUGO, channel_ce_count_show, NULL, 2); -DEVICE_CHANNEL(ch3_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch3_ce_count, S_IRUGO, channel_ce_count_show, NULL, 3); -DEVICE_CHANNEL(ch4_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch4_ce_count, S_IRUGO, channel_ce_count_show, NULL, 4); -DEVICE_CHANNEL(ch5_ce_count, S_IRUGO | S_IWUSR, +DEVICE_CHANNEL(ch5_ce_count, S_IRUGO, channel_ce_count_show, NULL, 5); /* Total possible dynamic ce_count attribute file table */ diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig index 5168a1324a65..3297301a42d4 100644 --- a/drivers/extcon/Kconfig +++ b/drivers/extcon/Kconfig @@ -16,7 +16,7 @@ comment "Extcon Device Drivers" config EXTCON_GPIO tristate "GPIO extcon support" - depends on GENERIC_GPIO + depends on GPIOLIB help Say Y here to enable GPIO based extcon support. Note that GPIO extcon supports single state per extcon instance. diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index 27ac423ab25e..7ef316fdc4d9 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -389,10 +389,8 @@ static void queue_bus_reset_event(struct client *client) struct bus_reset_event *e; e = kzalloc(sizeof(*e), GFP_KERNEL); - if (e == NULL) { - fw_notice(client->device->card, "out of memory when allocating event\n"); + if (e == NULL) return; - } fill_bus_reset_event(&e->reset, client); @@ -693,10 +691,9 @@ static void handle_request(struct fw_card *card, struct fw_request *request, r = kmalloc(sizeof(*r), GFP_ATOMIC); e = kmalloc(sizeof(*e), GFP_ATOMIC); - if (r == NULL || e == NULL) { - fw_notice(card, "out of memory when allocating event\n"); + if (r == NULL || e == NULL) goto failed; - } + r->card = card; r->request = request; r->data = payload; @@ -930,10 +927,9 @@ static void iso_callback(struct fw_iso_context *context, u32 cycle, struct iso_interrupt_event *e; e = kmalloc(sizeof(*e) + header_length, GFP_ATOMIC); - if (e == NULL) { - fw_notice(context->card, "out of memory when allocating event\n"); + if (e == NULL) return; - } + e->interrupt.type = FW_CDEV_EVENT_ISO_INTERRUPT; e->interrupt.closure = client->iso_closure; e->interrupt.cycle = cycle; @@ -950,10 +946,9 @@ static void iso_mc_callback(struct fw_iso_context *context, struct iso_interrupt_mc_event *e; e = kmalloc(sizeof(*e), GFP_ATOMIC); - if (e == NULL) { - fw_notice(context->card, "out of memory when allocating event\n"); + if (e == NULL) return; - } + e->interrupt.type = FW_CDEV_EVENT_ISO_INTERRUPT_MULTICHANNEL; e->interrupt.closure = client->iso_closure; e->interrupt.completed = fw_iso_buffer_lookup(&client->buffer, @@ -1366,8 +1361,7 @@ static int init_iso_resource(struct client *client, int ret; if ((request->channels == 0 && request->bandwidth == 0) || - request->bandwidth > BANDWIDTH_AVAILABLE_INITIAL || - request->bandwidth < 0) + request->bandwidth > BANDWIDTH_AVAILABLE_INITIAL) return -EINVAL; r = kmalloc(sizeof(*r), GFP_KERNEL); @@ -1582,10 +1576,9 @@ void fw_cdev_handle_phy_packet(struct fw_card *card, struct fw_packet *p) list_for_each_entry(client, &card->phy_receiver_list, phy_receiver_link) { e = kmalloc(sizeof(*e) + 8, GFP_ATOMIC); - if (e == NULL) { - fw_notice(card, "out of memory when allocating event\n"); + if (e == NULL) break; - } + e->phy_packet.closure = client->phy_receiver_closure; e->phy_packet.type = FW_CDEV_EVENT_PHY_PACKET_RECEIVED; e->phy_packet.rcode = RCODE_COMPLETE; diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c index 03ce7d980c6a..664a6ff0a823 100644 --- a/drivers/firewire/core-device.c +++ b/drivers/firewire/core-device.c @@ -692,10 +692,8 @@ static void create_units(struct fw_device *device) * match the drivers id_tables against it. */ unit = kzalloc(sizeof(*unit), GFP_KERNEL); - if (unit == NULL) { - fw_err(device->card, "out of memory for unit\n"); + if (unit == NULL) continue; - } unit->directory = ci.p + value - 1; unit->device.bus = &fw_bus_type; diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c index 4d565365e476..815b0fcbe918 100644 --- a/drivers/firewire/net.c +++ b/drivers/firewire/net.c @@ -356,10 +356,8 @@ static struct fwnet_fragment_info *fwnet_frag_new( } new = kmalloc(sizeof(*new), GFP_ATOMIC); - if (!new) { - dev_err(&pd->skb->dev->dev, "out of memory\n"); + if (!new) return NULL; - } new->offset = offset; new->len = len; @@ -402,8 +400,6 @@ fail_w_fi: fail_w_new: kfree(new); fail: - dev_err(&net->dev, "out of memory\n"); - return NULL; } @@ -609,7 +605,6 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len, skb = dev_alloc_skb(len + LL_RESERVED_SPACE(net)); if (unlikely(!skb)) { - dev_err(&net->dev, "out of memory\n"); net->stats.rx_dropped++; return -ENOMEM; diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c index 45912e6e0ac2..9e1db6490b9a 100644 --- a/drivers/firewire/ohci.c +++ b/drivers/firewire/ohci.c @@ -54,6 +54,10 @@ #include "core.h" #include "ohci.h" +#define ohci_info(ohci, f, args...) dev_info(ohci->card.device, f, ##args) +#define ohci_notice(ohci, f, args...) dev_notice(ohci->card.device, f, ##args) +#define ohci_err(ohci, f, args...) dev_err(ohci->card.device, f, ##args) + #define DESCRIPTOR_OUTPUT_MORE 0 #define DESCRIPTOR_OUTPUT_LAST (1 << 12) #define DESCRIPTOR_INPUT_MORE (2 << 12) @@ -68,6 +72,8 @@ #define DESCRIPTOR_BRANCH_ALWAYS (3 << 2) #define DESCRIPTOR_WAIT (3 << 0) +#define DESCRIPTOR_CMD (0xf << 12) + struct descriptor { __le16 req_count; __le16 control; @@ -149,10 +155,11 @@ struct context { struct descriptor *last; /* - * The last descriptor in the DMA program. It contains the branch + * The last descriptor block in the DMA program. It contains the branch * address that must be updated upon appending a new descriptor. */ struct descriptor *prev; + int prev_z; descriptor_callback_t callback; @@ -270,7 +277,9 @@ static char ohci_driver_name[] = KBUILD_MODNAME; #define PCI_DEVICE_ID_TI_TSB12LV22 0x8009 #define PCI_DEVICE_ID_TI_TSB12LV26 0x8020 #define PCI_DEVICE_ID_TI_TSB82AA2 0x8025 +#define PCI_DEVICE_ID_VIA_VT630X 0x3044 #define PCI_VENDOR_ID_PINNACLE_SYSTEMS 0x11bd +#define PCI_REV_ID_VIA_VT6306 0x46 #define QUIRK_CYCLE_TIMER 1 #define QUIRK_RESET_PACKET 2 @@ -278,6 +287,8 @@ static char ohci_driver_name[] = KBUILD_MODNAME; #define QUIRK_NO_1394A 8 #define QUIRK_NO_MSI 16 #define QUIRK_TI_SLLZ059 32 +#define QUIRK_IR_WAKE 64 +#define QUIRK_PHY_LCTRL_TIMEOUT 128 /* In case of multiple matches in ohci_quirks[], only the first one is used. */ static const struct { @@ -290,7 +301,10 @@ static const struct { QUIRK_BE_HEADERS}, {PCI_VENDOR_ID_ATT, PCI_DEVICE_ID_AGERE_FW643, 6, - QUIRK_NO_MSI}, + QUIRK_PHY_LCTRL_TIMEOUT | QUIRK_NO_MSI}, + + {PCI_VENDOR_ID_ATT, PCI_ANY_ID, PCI_ANY_ID, + QUIRK_PHY_LCTRL_TIMEOUT}, {PCI_VENDOR_ID_CREATIVE, PCI_DEVICE_ID_CREATIVE_SB1394, PCI_ANY_ID, QUIRK_RESET_PACKET}, @@ -319,6 +333,9 @@ static const struct { {PCI_VENDOR_ID_TI, PCI_ANY_ID, PCI_ANY_ID, QUIRK_RESET_PACKET}, + {PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_VT630X, PCI_REV_ID_VIA_VT6306, + QUIRK_CYCLE_TIMER | QUIRK_IR_WAKE}, + {PCI_VENDOR_ID_VIA, PCI_ANY_ID, PCI_ANY_ID, QUIRK_CYCLE_TIMER | QUIRK_NO_MSI}, }; @@ -333,6 +350,8 @@ MODULE_PARM_DESC(quirks, "Chip quirks (default = 0" ", no 1394a enhancements = " __stringify(QUIRK_NO_1394A) ", disable MSI = " __stringify(QUIRK_NO_MSI) ", TI SLLZ059 erratum = " __stringify(QUIRK_TI_SLLZ059) + ", IR wake unreliable = " __stringify(QUIRK_IR_WAKE) + ", phy LCtrl timeout = " __stringify(QUIRK_PHY_LCTRL_TIMEOUT) ")"); #define OHCI_PARAM_DEBUG_AT_AR 1 @@ -359,8 +378,7 @@ static void log_irqs(struct fw_ohci *ohci, u32 evt) !(evt & OHCI1394_busReset)) return; - dev_notice(ohci->card.device, - "IRQ %08x%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", evt, + ohci_notice(ohci, "IRQ %08x%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", evt, evt & OHCI1394_selfIDComplete ? " selfID" : "", evt & OHCI1394_RQPkt ? " AR_req" : "", evt & OHCI1394_RSPkt ? " AR_resp" : "", @@ -406,21 +424,19 @@ static void log_selfids(struct fw_ohci *ohci, int generation, int self_id_count) if (likely(!(param_debug & OHCI_PARAM_DEBUG_SELFIDS))) return; - dev_notice(ohci->card.device, - "%d selfIDs, generation %d, local node ID %04x\n", - self_id_count, generation, ohci->node_id); + ohci_notice(ohci, "%d selfIDs, generation %d, local node ID %04x\n", + self_id_count, generation, ohci->node_id); for (s = ohci->self_id_buffer; self_id_count--; ++s) if ((*s & 1 << 23) == 0) - dev_notice(ohci->card.device, - "selfID 0: %08x, phy %d [%c%c%c] " - "%s gc=%d %s %s%s%s\n", + ohci_notice(ohci, + "selfID 0: %08x, phy %d [%c%c%c] %s gc=%d %s %s%s%s\n", *s, *s >> 24 & 63, _p(s, 6), _p(s, 4), _p(s, 2), speed[*s >> 14 & 3], *s >> 16 & 63, power[*s >> 8 & 7], *s >> 22 & 1 ? "L" : "", *s >> 11 & 1 ? "c" : "", *s & 2 ? "i" : ""); else - dev_notice(ohci->card.device, + ohci_notice(ohci, "selfID n: %08x, phy %d [%c%c%c%c%c%c%c%c]\n", *s, *s >> 24 & 63, _p(s, 16), _p(s, 14), _p(s, 12), _p(s, 10), @@ -470,9 +486,8 @@ static void log_ar_at_event(struct fw_ohci *ohci, evt = 0x1f; if (evt == OHCI1394_evt_bus_reset) { - dev_notice(ohci->card.device, - "A%c evt_bus_reset, generation %d\n", - dir, (header[2] >> 16) & 0xff); + ohci_notice(ohci, "A%c evt_bus_reset, generation %d\n", + dir, (header[2] >> 16) & 0xff); return; } @@ -491,32 +506,26 @@ static void log_ar_at_event(struct fw_ohci *ohci, switch (tcode) { case 0xa: - dev_notice(ohci->card.device, - "A%c %s, %s\n", - dir, evts[evt], tcodes[tcode]); + ohci_notice(ohci, "A%c %s, %s\n", + dir, evts[evt], tcodes[tcode]); break; case 0xe: - dev_notice(ohci->card.device, - "A%c %s, PHY %08x %08x\n", - dir, evts[evt], header[1], header[2]); + ohci_notice(ohci, "A%c %s, PHY %08x %08x\n", + dir, evts[evt], header[1], header[2]); break; case 0x0: case 0x1: case 0x4: case 0x5: case 0x9: - dev_notice(ohci->card.device, - "A%c spd %x tl %02x, " - "%04x -> %04x, %s, " - "%s, %04x%08x%s\n", - dir, speed, header[0] >> 10 & 0x3f, - header[1] >> 16, header[0] >> 16, evts[evt], - tcodes[tcode], header[1] & 0xffff, header[2], specific); + ohci_notice(ohci, + "A%c spd %x tl %02x, %04x -> %04x, %s, %s, %04x%08x%s\n", + dir, speed, header[0] >> 10 & 0x3f, + header[1] >> 16, header[0] >> 16, evts[evt], + tcodes[tcode], header[1] & 0xffff, header[2], specific); break; default: - dev_notice(ohci->card.device, - "A%c spd %x tl %02x, " - "%04x -> %04x, %s, " - "%s%s\n", - dir, speed, header[0] >> 10 & 0x3f, - header[1] >> 16, header[0] >> 16, evts[evt], - tcodes[tcode], specific); + ohci_notice(ohci, + "A%c spd %x tl %02x, %04x -> %04x, %s, %s%s\n", + dir, speed, header[0] >> 10 & 0x3f, + header[1] >> 16, header[0] >> 16, evts[evt], + tcodes[tcode], specific); } } @@ -563,7 +572,8 @@ static int read_phy_reg(struct fw_ohci *ohci, int addr) if (i >= 3) msleep(1); } - dev_err(ohci->card.device, "failed to read phy reg\n"); + ohci_err(ohci, "failed to read phy reg %d\n", addr); + dump_stack(); return -EBUSY; } @@ -585,7 +595,8 @@ static int write_phy_reg(const struct fw_ohci *ohci, int addr, u32 val) if (i >= 3) msleep(1); } - dev_err(ohci->card.device, "failed to write phy reg\n"); + ohci_err(ohci, "failed to write phy reg %d, val %u\n", addr, val); + dump_stack(); return -EBUSY; } @@ -690,8 +701,7 @@ static void ar_context_abort(struct ar_context *ctx, const char *error_msg) reg_write(ohci, CONTROL_CLEAR(ctx->regs), CONTEXT_RUN); flush_writes(ohci); - dev_err(ohci->card.device, "AR error: %s; DMA stopped\n", - error_msg); + ohci_err(ohci, "AR error: %s; DMA stopped\n", error_msg); } /* FIXME: restart? */ } @@ -1157,6 +1167,7 @@ static int context_init(struct context *ctx, struct fw_ohci *ohci, ctx->buffer_tail->used += sizeof(*ctx->buffer_tail->buffer); ctx->last = ctx->buffer_tail->buffer; ctx->prev = ctx->buffer_tail->buffer; + ctx->prev_z = 1; return 0; } @@ -1221,14 +1232,35 @@ static void context_append(struct context *ctx, { dma_addr_t d_bus; struct descriptor_buffer *desc = ctx->buffer_tail; + struct descriptor *d_branch; d_bus = desc->buffer_bus + (d - desc->buffer) * sizeof(*d); desc->used += (z + extra) * sizeof(*d); wmb(); /* finish init of new descriptors before branch_address update */ - ctx->prev->branch_address = cpu_to_le32(d_bus | z); - ctx->prev = find_branch_descriptor(d, z); + + d_branch = find_branch_descriptor(ctx->prev, ctx->prev_z); + d_branch->branch_address = cpu_to_le32(d_bus | z); + + /* + * VT6306 incorrectly checks only the single descriptor at the + * CommandPtr when the wake bit is written, so if it's a + * multi-descriptor block starting with an INPUT_MORE, put a copy of + * the branch address in the first descriptor. + * + * Not doing this for transmit contexts since not sure how it interacts + * with skip addresses. + */ + if (unlikely(ctx->ohci->quirks & QUIRK_IR_WAKE) && + d_branch != ctx->prev && + (ctx->prev->control & cpu_to_le16(DESCRIPTOR_CMD)) == + cpu_to_le16(DESCRIPTOR_INPUT_MORE)) { + ctx->prev->branch_address = cpu_to_le32(d_bus | z); + } + + ctx->prev = d; + ctx->prev_z = z; } static void context_stop(struct context *ctx) @@ -1248,7 +1280,7 @@ static void context_stop(struct context *ctx) if (i) udelay(10); } - dev_err(ohci->card.device, "DMA context still active (0x%08x)\n", reg); + ohci_err(ohci, "DMA context still active (0x%08x)\n", reg); } struct driver_data { @@ -1557,7 +1589,7 @@ static void handle_local_lock(struct fw_ohci *ohci, goto out; } - dev_err(ohci->card.device, "swap not done (CSR lock timeout)\n"); + ohci_err(ohci, "swap not done (CSR lock timeout)\n"); fw_fill_response(&response, packet->header, RCODE_BUSY, NULL, 0); out: @@ -1632,8 +1664,7 @@ static void detect_dead_context(struct fw_ohci *ohci, ctl = reg_read(ohci, CONTROL_SET(regs)); if (ctl & CONTEXT_DEAD) - dev_err(ohci->card.device, - "DMA context %s has stopped, error code: %s\n", + ohci_err(ohci, "DMA context %s has stopped, error code: %s\n", name, evts[ctl & 0x1f]); } @@ -1815,8 +1846,8 @@ static int find_and_insert_self_id(struct fw_ohci *ohci, int self_id_count) reg = reg_read(ohci, OHCI1394_NodeID); if (!(reg & OHCI1394_NodeID_idValid)) { - dev_notice(ohci->card.device, - "node ID not valid, new bus reset in progress\n"); + ohci_notice(ohci, + "node ID not valid, new bus reset in progress\n"); return -EBUSY; } self_id |= ((reg & 0x3f) << 24); /* phy ID */ @@ -1863,12 +1894,12 @@ static void bus_reset_work(struct work_struct *work) reg = reg_read(ohci, OHCI1394_NodeID); if (!(reg & OHCI1394_NodeID_idValid)) { - dev_notice(ohci->card.device, - "node ID not valid, new bus reset in progress\n"); + ohci_notice(ohci, + "node ID not valid, new bus reset in progress\n"); return; } if ((reg & OHCI1394_NodeID_nodeNumber) == 63) { - dev_notice(ohci->card.device, "malconfigured bus\n"); + ohci_notice(ohci, "malconfigured bus\n"); return; } ohci->node_id = reg & (OHCI1394_NodeID_busNumber | @@ -1882,7 +1913,7 @@ static void bus_reset_work(struct work_struct *work) reg = reg_read(ohci, OHCI1394_SelfIDCount); if (reg & OHCI1394_SelfIDCount_selfIDError) { - dev_notice(ohci->card.device, "inconsistent self IDs\n"); + ohci_notice(ohci, "self ID receive error\n"); return; } /* @@ -1894,7 +1925,7 @@ static void bus_reset_work(struct work_struct *work) self_id_count = (reg >> 3) & 0xff; if (self_id_count > 252) { - dev_notice(ohci->card.device, "inconsistent self IDs\n"); + ohci_notice(ohci, "bad selfIDSize (%08x)\n", reg); return; } @@ -1902,7 +1933,10 @@ static void bus_reset_work(struct work_struct *work) rmb(); for (i = 1, j = 0; j < self_id_count; i += 2, j++) { - if (ohci->self_id_cpu[i] != ~ohci->self_id_cpu[i + 1]) { + u32 id = cond_le32_to_cpu(ohci->self_id_cpu[i]); + u32 id2 = cond_le32_to_cpu(ohci->self_id_cpu[i + 1]); + + if (id != ~id2) { /* * If the invalid data looks like a cycle start packet, * it's likely to be the result of the cycle master @@ -1910,33 +1944,30 @@ static void bus_reset_work(struct work_struct *work) * so far are valid and should be processed so that the * bus manager can then correct the gap count. */ - if (cond_le32_to_cpu(ohci->self_id_cpu[i]) - == 0xffff008f) { - dev_notice(ohci->card.device, - "ignoring spurious self IDs\n"); + if (id == 0xffff008f) { + ohci_notice(ohci, "ignoring spurious self IDs\n"); self_id_count = j; break; - } else { - dev_notice(ohci->card.device, - "inconsistent self IDs\n"); - return; } + + ohci_notice(ohci, "bad self ID %d/%d (%08x != ~%08x)\n", + j, self_id_count, id, id2); + return; } - ohci->self_id_buffer[j] = - cond_le32_to_cpu(ohci->self_id_cpu[i]); + ohci->self_id_buffer[j] = id; } if (ohci->quirks & QUIRK_TI_SLLZ059) { self_id_count = find_and_insert_self_id(ohci, self_id_count); if (self_id_count < 0) { - dev_notice(ohci->card.device, - "could not construct local self ID\n"); + ohci_notice(ohci, + "could not construct local self ID\n"); return; } } if (self_id_count == 0) { - dev_notice(ohci->card.device, "inconsistent self IDs\n"); + ohci_notice(ohci, "no self IDs\n"); return; } rmb(); @@ -1957,8 +1988,7 @@ static void bus_reset_work(struct work_struct *work) new_generation = (reg_read(ohci, OHCI1394_SelfIDCount) >> 16) & 0xff; if (new_generation != generation) { - dev_notice(ohci->card.device, - "new bus reset, discarding self ids\n"); + ohci_notice(ohci, "new bus reset, discarding self ids\n"); return; } @@ -2096,7 +2126,7 @@ static irqreturn_t irq_handler(int irq, void *data) } if (unlikely(event & OHCI1394_regAccessFail)) - dev_err(ohci->card.device, "register access failure\n"); + ohci_err(ohci, "register access failure\n"); if (unlikely(event & OHCI1394_postedWriteErr)) { reg_read(ohci, OHCI1394_PostedWriteAddressHi); @@ -2104,13 +2134,12 @@ static irqreturn_t irq_handler(int irq, void *data) reg_write(ohci, OHCI1394_IntEventClear, OHCI1394_postedWriteErr); if (printk_ratelimit()) - dev_err(ohci->card.device, "PCI posted write error\n"); + ohci_err(ohci, "PCI posted write error\n"); } if (unlikely(event & OHCI1394_cycleTooLong)) { if (printk_ratelimit()) - dev_notice(ohci->card.device, - "isochronous cycle too long\n"); + ohci_notice(ohci, "isochronous cycle too long\n"); reg_write(ohci, OHCI1394_LinkControlSet, OHCI1394_LinkControl_cycleMaster); } @@ -2123,8 +2152,7 @@ static irqreturn_t irq_handler(int irq, void *data) * them at least two cycles later. (FIXME?) */ if (printk_ratelimit()) - dev_notice(ohci->card.device, - "isochronous cycle inconsistent\n"); + ohci_notice(ohci, "isochronous cycle inconsistent\n"); } if (unlikely(event & OHCI1394_unrecoverableError)) @@ -2246,12 +2274,11 @@ static int ohci_enable(struct fw_card *card, const __be32 *config_rom, size_t length) { struct fw_ohci *ohci = fw_ohci(card); - struct pci_dev *dev = to_pci_dev(card->device); u32 lps, version, irqs; int i, ret; if (software_reset(ohci)) { - dev_err(card->device, "failed to reset ohci card\n"); + ohci_err(ohci, "failed to reset ohci card\n"); return -EBUSY; } @@ -2262,20 +2289,31 @@ static int ohci_enable(struct fw_card *card, * will lock up the machine. Wait 50msec to make sure we have * full link enabled. However, with some cards (well, at least * a JMicron PCIe card), we have to try again sometimes. + * + * TI TSB82AA2 + TSB81BA3(A) cards signal LPS enabled early but + * cannot actually use the phy at that time. These need tens of + * millisecods pause between LPS write and first phy access too. + * + * But do not wait for 50msec on Agere/LSI cards. Their phy + * arbitration state machine may time out during such a long wait. */ + reg_write(ohci, OHCI1394_HCControlSet, OHCI1394_HCControl_LPS | OHCI1394_HCControl_postedWriteEnable); flush_writes(ohci); - for (lps = 0, i = 0; !lps && i < 3; i++) { + if (!(ohci->quirks & QUIRK_PHY_LCTRL_TIMEOUT)) msleep(50); + + for (lps = 0, i = 0; !lps && i < 150; i++) { + msleep(1); lps = reg_read(ohci, OHCI1394_HCControlSet) & OHCI1394_HCControl_LPS; } if (!lps) { - dev_err(card->device, "failed to set Link Power Status\n"); + ohci_err(ohci, "failed to set Link Power Status\n"); return -EIO; } @@ -2284,7 +2322,7 @@ static int ohci_enable(struct fw_card *card, if (ret < 0) return ret; if (ret) - dev_notice(card->device, "local TSB41BA3D phy\n"); + ohci_notice(ohci, "local TSB41BA3D phy\n"); else ohci->quirks &= ~QUIRK_TI_SLLZ059; } @@ -2382,24 +2420,6 @@ static int ohci_enable(struct fw_card *card, reg_write(ohci, OHCI1394_AsReqFilterHiSet, 0x80000000); - if (!(ohci->quirks & QUIRK_NO_MSI)) - pci_enable_msi(dev); - if (request_irq(dev->irq, irq_handler, - pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED, - ohci_driver_name, ohci)) { - dev_err(card->device, "failed to allocate interrupt %d\n", - dev->irq); - pci_disable_msi(dev); - - if (config_rom) { - dma_free_coherent(ohci->card.device, CONFIG_ROM_SIZE, - ohci->next_config_rom, - ohci->next_config_rom_bus); - ohci->next_config_rom = NULL; - } - return -EIO; - } - irqs = OHCI1394_reqTxComplete | OHCI1394_respTxComplete | OHCI1394_RQPkt | OHCI1394_RSPkt | OHCI1394_isochTx | OHCI1394_isochRx | @@ -3578,20 +3598,20 @@ static int pci_probe(struct pci_dev *dev, if (!(pci_resource_flags(dev, 0) & IORESOURCE_MEM) || pci_resource_len(dev, 0) < OHCI1394_REGISTER_SIZE) { - dev_err(&dev->dev, "invalid MMIO resource\n"); + ohci_err(ohci, "invalid MMIO resource\n"); err = -ENXIO; goto fail_disable; } err = pci_request_region(dev, 0, ohci_driver_name); if (err) { - dev_err(&dev->dev, "MMIO resource unavailable\n"); + ohci_err(ohci, "MMIO resource unavailable\n"); goto fail_disable; } ohci->registers = pci_iomap(dev, 0, OHCI1394_REGISTER_SIZE); if (ohci->registers == NULL) { - dev_err(&dev->dev, "failed to remap registers\n"); + ohci_err(ohci, "failed to remap registers\n"); err = -ENXIO; goto fail_iomem; } @@ -3675,19 +3695,33 @@ static int pci_probe(struct pci_dev *dev, guid = ((u64) reg_read(ohci, OHCI1394_GUIDHi) << 32) | reg_read(ohci, OHCI1394_GUIDLo); + if (!(ohci->quirks & QUIRK_NO_MSI)) + pci_enable_msi(dev); + if (request_irq(dev->irq, irq_handler, + pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED, + ohci_driver_name, ohci)) { + ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq); + err = -EIO; + goto fail_msi; + } + err = fw_card_add(&ohci->card, max_receive, link_speed, guid); if (err) - goto fail_contexts; + goto fail_irq; version = reg_read(ohci, OHCI1394_Version) & 0x00ff00ff; - dev_notice(&dev->dev, - "added OHCI v%x.%x device as card %d, " - "%d IR + %d IT contexts, quirks 0x%x\n", - version >> 16, version & 0xff, ohci->card.index, - ohci->n_ir, ohci->n_it, ohci->quirks); + ohci_notice(ohci, + "added OHCI v%x.%x device as card %d, " + "%d IR + %d IT contexts, quirks 0x%x\n", + version >> 16, version & 0xff, ohci->card.index, + ohci->n_ir, ohci->n_it, ohci->quirks); return 0; + fail_irq: + free_irq(dev->irq, ohci); + fail_msi: + pci_disable_msi(dev); fail_contexts: kfree(ohci->ir_context_list); kfree(ohci->it_context_list); @@ -3711,19 +3745,21 @@ static int pci_probe(struct pci_dev *dev, kfree(ohci); pmac_ohci_off(dev); fail: - if (err == -ENOMEM) - dev_err(&dev->dev, "out of memory\n"); - return err; } static void pci_remove(struct pci_dev *dev) { - struct fw_ohci *ohci; + struct fw_ohci *ohci = pci_get_drvdata(dev); - ohci = pci_get_drvdata(dev); - reg_write(ohci, OHCI1394_IntMaskClear, ~0); - flush_writes(ohci); + /* + * If the removal is happening from the suspend state, LPS won't be + * enabled and host registers (eg., IntMaskClear) won't be accessible. + */ + if (reg_read(ohci, OHCI1394_HCControlSet) & OHCI1394_HCControl_LPS) { + reg_write(ohci, OHCI1394_IntMaskClear, ~0); + flush_writes(ohci); + } cancel_work_sync(&ohci->bus_reset_work); fw_core_remove_card(&ohci->card); @@ -3766,16 +3802,14 @@ static int pci_suspend(struct pci_dev *dev, pm_message_t state) int err; software_reset(ohci); - free_irq(dev->irq, ohci); - pci_disable_msi(dev); err = pci_save_state(dev); if (err) { - dev_err(&dev->dev, "pci_save_state failed\n"); + ohci_err(ohci, "pci_save_state failed\n"); return err; } err = pci_set_power_state(dev, pci_choose_state(dev, state)); if (err) - dev_err(&dev->dev, "pci_set_power_state failed with %d\n", err); + ohci_err(ohci, "pci_set_power_state failed with %d\n", err); pmac_ohci_off(dev); return 0; @@ -3791,7 +3825,7 @@ static int pci_resume(struct pci_dev *dev) pci_restore_state(dev); err = pci_enable_device(dev); if (err) { - dev_err(&dev->dev, "pci_enable_device failed\n"); + ohci_err(ohci, "pci_enable_device failed\n"); return err; } @@ -3837,6 +3871,4 @@ MODULE_DESCRIPTION("Driver for PCI OHCI IEEE1394 controllers"); MODULE_LICENSE("GPL"); /* Provide a module alias so root-on-sbp2 initrds don't break. */ -#ifndef CONFIG_IEEE1394_OHCI1394_MODULE MODULE_ALIAS("ohci1394"); -#endif diff --git a/drivers/firewire/sbp2.c b/drivers/firewire/sbp2.c index 1162d6b3bf85..47674b913843 100644 --- a/drivers/firewire/sbp2.c +++ b/drivers/firewire/sbp2.c @@ -1144,8 +1144,8 @@ static int sbp2_probe(struct device *dev) return -ENODEV; if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE) - BUG_ON(dma_set_max_seg_size(device->card->device, - SBP2_MAX_SEG_SIZE)); + WARN_ON(dma_set_max_seg_size(device->card->device, + SBP2_MAX_SEG_SIZE)); shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt)); if (shost == NULL) @@ -1475,10 +1475,8 @@ static int sbp2_scsi_queuecommand(struct Scsi_Host *shost, } orb = kzalloc(sizeof(*orb), GFP_ATOMIC); - if (orb == NULL) { - dev_notice(lu_dev(lu), "failed to alloc ORB\n"); + if (orb == NULL) return SCSI_MLQUEUE_HOST_BUSY; - } /* Initialize rcode to something not RCODE_COMPLETE. */ orb->base.rcode = -1; @@ -1636,9 +1634,7 @@ MODULE_LICENSE("GPL"); MODULE_DEVICE_TABLE(ieee1394, sbp2_id_table); /* Provide a module alias so root-on-sbp2 initrds don't break. */ -#ifndef CONFIG_IEEE1394_SBP2_MODULE MODULE_ALIAS("sbp2"); -#endif static int __init sbp2_init(void) { diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index c22eed9481e3..87d567089f13 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -38,7 +38,6 @@ config GPIO_DEVRES menuconfig GPIOLIB bool "GPIO Support" depends on ARCH_WANT_OPTIONAL_GPIOLIB || ARCH_REQUIRE_GPIOLIB - select GENERIC_GPIO help This enables GPIO support through the generic GPIO library. You only need to enable this, if you also want to enable diff --git a/drivers/gpio/gpio-lpc32xx.c b/drivers/gpio/gpio-lpc32xx.c index dda6a756a3d9..90a80eb688a9 100644 --- a/drivers/gpio/gpio-lpc32xx.c +++ b/drivers/gpio/gpio-lpc32xx.c @@ -255,7 +255,7 @@ static int __get_gpo_state_p3(struct lpc32xx_gpio_chip *group, } /* - * GENERIC_GPIO primitives. + * GPIO primitives. */ static int lpc32xx_gpio_dir_input_p012(struct gpio_chip *chip, unsigned pin) diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index 6961bbeab3ed..264f55099940 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1685,6 +1685,7 @@ static const struct hid_device_id hid_have_special_driver[] = { { HID_USB_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_NAVIGATION_CONTROLLER) }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS3_CONTROLLER) }, { HID_USB_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_VAIO_VGX_MOUSE) }, + { HID_USB_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_VAIO_VGP_MOUSE) }, { HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_SRWS1) }, { HID_USB_DEVICE(USB_VENDOR_ID_SUNPLUS, USB_DEVICE_ID_SUNPLUS_WDESKTOP) }, { HID_USB_DEVICE(USB_VENDOR_ID_THINGM, USB_DEVICE_ID_BLINK1) }, @@ -2341,7 +2342,7 @@ struct hid_device *hid_allocate_device(void) init_waitqueue_head(&hdev->debug_wait); INIT_LIST_HEAD(&hdev->debug_list); - mutex_init(&hdev->debug_list_lock); + spin_lock_init(&hdev->debug_list_lock); sema_init(&hdev->driver_lock, 1); sema_init(&hdev->driver_input_lock, 1); diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c index 7e56cb3855e3..8453214ec376 100644 --- a/drivers/hid/hid-debug.c +++ b/drivers/hid/hid-debug.c @@ -579,15 +579,16 @@ void hid_debug_event(struct hid_device *hdev, char *buf) { int i; struct hid_debug_list *list; + unsigned long flags; - mutex_lock(&hdev->debug_list_lock); + spin_lock_irqsave(&hdev->debug_list_lock, flags); list_for_each_entry(list, &hdev->debug_list, node) { for (i = 0; i < strlen(buf); i++) list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] = buf[i]; list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE; } - mutex_unlock(&hdev->debug_list_lock); + spin_unlock_irqrestore(&hdev->debug_list_lock, flags); wake_up_interruptible(&hdev->debug_wait); } @@ -977,6 +978,7 @@ static int hid_debug_events_open(struct inode *inode, struct file *file) { int err = 0; struct hid_debug_list *list; + unsigned long flags; if (!(list = kzalloc(sizeof(struct hid_debug_list), GFP_KERNEL))) { err = -ENOMEM; @@ -992,9 +994,9 @@ static int hid_debug_events_open(struct inode *inode, struct file *file) file->private_data = list; mutex_init(&list->read_mutex); - mutex_lock(&list->hdev->debug_list_lock); + spin_lock_irqsave(&list->hdev->debug_list_lock, flags); list_add_tail(&list->node, &list->hdev->debug_list); - mutex_unlock(&list->hdev->debug_list_lock); + spin_unlock_irqrestore(&list->hdev->debug_list_lock, flags); out: return err; @@ -1088,10 +1090,11 @@ static unsigned int hid_debug_events_poll(struct file *file, poll_table *wait) static int hid_debug_events_release(struct inode *inode, struct file *file) { struct hid_debug_list *list = file->private_data; + unsigned long flags; - mutex_lock(&list->hdev->debug_list_lock); + spin_lock_irqsave(&list->hdev->debug_list_lock, flags); list_del(&list->node); - mutex_unlock(&list->hdev->debug_list_lock); + spin_unlock_irqrestore(&list->hdev->debug_list_lock, flags); kfree(list->hid_debug_buf); kfree(list); diff --git a/drivers/hid/hid-steelseries.c b/drivers/hid/hid-steelseries.c index 9b0efb0083fe..d16491192112 100644 --- a/drivers/hid/hid-steelseries.c +++ b/drivers/hid/hid-steelseries.c @@ -18,7 +18,8 @@ #include "hid-ids.h" -#if defined(CONFIG_LEDS_CLASS) || defined(CONFIG_LEDS_CLASS_MODULE) +#if IS_BUILTIN(CONFIG_LEDS_CLASS) || \ + (IS_MODULE(CONFIG_LEDS_CLASS) && IS_MODULE(CONFIG_HID_STEELSERIES)) #define SRWS1_NUMBER_LEDS 15 struct steelseries_srws1_data { __u16 led_state; @@ -107,7 +108,8 @@ static __u8 steelseries_srws1_rdesc_fixed[] = { 0xC0 /* End Collection */ }; -#if defined(CONFIG_LEDS_CLASS) || defined(CONFIG_LEDS_CLASS_MODULE) +#if IS_BUILTIN(CONFIG_LEDS_CLASS) || \ + (IS_MODULE(CONFIG_LEDS_CLASS) && IS_MODULE(CONFIG_HID_STEELSERIES)) static void steelseries_srws1_set_leds(struct hid_device *hdev, __u16 leds) { struct list_head *report_list = &hdev->report_enum[HID_OUTPUT_REPORT].report_list; @@ -370,7 +372,8 @@ MODULE_DEVICE_TABLE(hid, steelseries_srws1_devices); static struct hid_driver steelseries_srws1_driver = { .name = "steelseries_srws1", .id_table = steelseries_srws1_devices, -#if defined(CONFIG_LEDS_CLASS) || defined(CONFIG_LEDS_CLASS_MODULE) +#if IS_BUILTIN(CONFIG_LEDS_CLASS) || \ + (IS_MODULE(CONFIG_LEDS_CLASS) && IS_MODULE(CONFIG_HID_STEELSERIES)) .probe = steelseries_srws1_probe, .remove = steelseries_srws1_remove, #endif diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index adfee98486b1..631736e2e7ed 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -363,7 +363,7 @@ config I2C_BLACKFIN_TWI_CLK_KHZ config I2C_CBUS_GPIO tristate "CBUS I2C driver" - depends on GENERIC_GPIO + depends on GPIOLIB help Support for CBUS access using I2C API. Mostly relevant for Nokia Internet Tablets (770, N800 and N810). @@ -436,7 +436,7 @@ config I2C_EG20T config I2C_GPIO tristate "GPIO-based bitbanging I2C" - depends on GENERIC_GPIO + depends on GPIOLIB select I2C_ALGOBIT help This is a very simple bitbanging I2C driver utilizing the diff --git a/drivers/i2c/muxes/Kconfig b/drivers/i2c/muxes/Kconfig index 5faf244d2476..f7f9865b8b89 100644 --- a/drivers/i2c/muxes/Kconfig +++ b/drivers/i2c/muxes/Kconfig @@ -7,7 +7,7 @@ menu "Multiplexer I2C Chip support" config I2C_ARB_GPIO_CHALLENGE tristate "GPIO-based I2C arbitration" - depends on GENERIC_GPIO && OF + depends on GPIOLIB && OF help If you say yes to this option, support will be included for an I2C multimaster arbitration scheme using GPIOs and a challenge & @@ -19,7 +19,7 @@ config I2C_ARB_GPIO_CHALLENGE config I2C_MUX_GPIO tristate "GPIO-based I2C multiplexer" - depends on GENERIC_GPIO + depends on GPIOLIB help If you say yes to this option, support will be included for a GPIO based I2C multiplexer. This driver provides access to diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c index 0bb99bb38809..c47c2034ca71 100644 --- a/drivers/infiniband/core/iwcm.c +++ b/drivers/infiniband/core/iwcm.c @@ -878,6 +878,8 @@ static void cm_work_handler(struct work_struct *_work) } return; } + if (empty) + return; spin_lock_irqsave(&cm_id_priv->lock, flags); } spin_unlock_irqrestore(&cm_id_priv->lock, flags); diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index a8fdd3381405..22192deb8828 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -348,7 +348,8 @@ static void __ib_shared_qp_event_handler(struct ib_event *event, void *context) struct ib_qp *qp = context; list_for_each_entry(event->element.qp, &qp->open_list, open_list) - event->element.qp->event_handler(event, event->element.qp->qp_context); + if (event->element.qp->event_handler) + event->element.qp->event_handler(event, event->element.qp->qp_context); } static void __ib_insert_xrcd_qp(struct ib_xrcd *xrcd, struct ib_qp *qp) diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index 9c12da0cbd32..e87f2201b220 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -559,7 +559,7 @@ static int iwch_reregister_phys_mem(struct ib_mr *mr, __be64 *page_list = NULL; int shift = 0; u64 total_size; - int npages; + int npages = 0; int ret; PDBG("%s ib_mr %p ib_pd %p\n", __func__, mr, pd); diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c index 5b059e2d80cc..232040447e8a 100644 --- a/drivers/infiniband/hw/cxgb4/qp.c +++ b/drivers/infiniband/hw/cxgb4/qp.c @@ -111,6 +111,16 @@ static int alloc_host_sq(struct c4iw_rdev *rdev, struct t4_sq *sq) return 0; } +static int alloc_sq(struct c4iw_rdev *rdev, struct t4_sq *sq, int user) +{ + int ret = -ENOSYS; + if (user) + ret = alloc_oc_sq(rdev, sq); + if (ret) + ret = alloc_host_sq(rdev, sq); + return ret; +} + static int destroy_qp(struct c4iw_rdev *rdev, struct t4_wq *wq, struct c4iw_dev_ucontext *uctx) { @@ -179,15 +189,9 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq, goto free_sw_rq; } - if (user) { - if (alloc_oc_sq(rdev, &wq->sq) && alloc_host_sq(rdev, &wq->sq)) - goto free_hwaddr; - } else { - ret = alloc_host_sq(rdev, &wq->sq); - if (ret) - goto free_hwaddr; - } - + ret = alloc_sq(rdev, &wq->sq, user); + if (ret) + goto free_hwaddr; memset(wq->sq.queue, 0, wq->sq.memsize); dma_unmap_addr_set(&wq->sq, mapping, wq->sq.dma_addr); diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index ea93870266eb..44ea9390417c 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -2187,7 +2187,8 @@ int ipath_register_ib_device(struct ipath_devdata *dd) if (ret) goto err_reg; - if (ipath_verbs_register_sysfs(dev)) + ret = ipath_verbs_register_sysfs(dev); + if (ret) goto err_class; enable_timer(dd); @@ -2327,15 +2328,15 @@ static int ipath_verbs_register_sysfs(struct ib_device *dev) int i; int ret; - for (i = 0; i < ARRAY_SIZE(ipath_class_attributes); ++i) - if (device_create_file(&dev->dev, - ipath_class_attributes[i])) { - ret = 1; + for (i = 0; i < ARRAY_SIZE(ipath_class_attributes); ++i) { + ret = device_create_file(&dev->dev, + ipath_class_attributes[i]); + if (ret) goto bail; - } - - ret = 0; - + } + return 0; bail: + for (i = 0; i < ARRAY_SIZE(ipath_class_attributes); ++i) + device_remove_file(&dev->dev, ipath_class_attributes[i]); return ret; } diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 73b3a7132587..d5e60f44ba5a 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -33,6 +33,7 @@ #include #include +#include #include #include "mlx4_ib.h" @@ -585,6 +586,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, struct mlx4_qp *mqp; struct mlx4_ib_wq *wq; struct mlx4_ib_srq *srq; + struct mlx4_srq *msrq = NULL; int is_send; int is_error; u32 g_mlpath_rqpn; @@ -653,6 +655,20 @@ repoll: wc->qp = &(*cur_qp)->ibqp; + if (wc->qp->qp_type == IB_QPT_XRC_TGT) { + u32 srq_num; + g_mlpath_rqpn = be32_to_cpu(cqe->g_mlpath_rqpn); + srq_num = g_mlpath_rqpn & 0xffffff; + /* SRQ is also in the radix tree */ + msrq = mlx4_srq_lookup(to_mdev(cq->ibcq.device)->dev, + srq_num); + if (unlikely(!msrq)) { + pr_warn("CQ %06x with entry for unknown SRQN %06x\n", + cq->mcq.cqn, srq_num); + return -EINVAL; + } + } + if (is_send) { wq = &(*cur_qp)->sq; if (!(*cur_qp)->sq_signal_bits) { @@ -666,6 +682,11 @@ repoll: wqe_ctr = be16_to_cpu(cqe->wqe_index); wc->wr_id = srq->wrid[wqe_ctr]; mlx4_ib_free_srq_wqe(srq, wqe_ctr); + } else if (msrq) { + srq = to_mibsrq(msrq); + wqe_ctr = be16_to_cpu(cqe->wqe_index); + wc->wr_id = srq->wrid[wqe_ctr]; + mlx4_ib_free_srq_wqe(srq, wqe_ctr); } else { wq = &(*cur_qp)->rq; tail = wq->tail & (wq->wqe_cnt - 1); diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 35cced2a4da8..4f10af2905b5 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1292,6 +1292,8 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { context->sq_size_stride |= !!qp->sq_no_prefetch << 7; context->xrcd = cpu_to_be32((u32) qp->xrcdn); + if (ibqp->qp_type == IB_QPT_RAW_PACKET) + context->param3 |= cpu_to_be32(1 << 30); } if (qp->ibqp.uobject) @@ -1458,6 +1460,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } } + if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET) + context->pri_path.ackto = (context->pri_path.ackto & 0xf8) | + MLX4_IB_LINK_TYPE_ETH; + if (cur_state == IB_QPS_RTS && new_state == IB_QPS_SQD && attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY && attr->en_sqd_async_notify) sqd_event = 1; diff --git a/drivers/infiniband/hw/qib/qib_sysfs.c b/drivers/infiniband/hw/qib/qib_sysfs.c index 034cc821de5c..3c8e4e3caca6 100644 --- a/drivers/infiniband/hw/qib/qib_sysfs.c +++ b/drivers/infiniband/hw/qib/qib_sysfs.c @@ -808,10 +808,14 @@ int qib_verbs_register_sysfs(struct qib_devdata *dd) for (i = 0; i < ARRAY_SIZE(qib_attributes); ++i) { ret = device_create_file(&dev->dev, qib_attributes[i]); if (ret) - return ret; + goto bail; } return 0; +bail: + for (i = 0; i < ARRAY_SIZE(qib_attributes); ++i) + device_remove_file(&dev->dev, qib_attributes[i]); + return ret; } /* diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c index 7c0ab16a2fe2..904c384aa361 100644 --- a/drivers/infiniband/hw/qib/qib_verbs.c +++ b/drivers/infiniband/hw/qib/qib_verbs.c @@ -2234,7 +2234,8 @@ int qib_register_ib_device(struct qib_devdata *dd) if (ret) goto err_agents; - if (qib_verbs_register_sysfs(dd)) + ret = qib_verbs_register_sysfs(dd); + if (ret) goto err_class; goto bail; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 554b9063da54..b6e049a3c7a8 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -830,7 +830,7 @@ static int ipoib_hard_header(struct sk_buff *skb, */ memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN); - return 0; + return sizeof *header; } static void ipoib_set_mcast_list(struct net_device *dev) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index 0ab8c9cc3a78..f19b0998a53c 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c @@ -82,10 +82,10 @@ module_param_named(max_lun, iscsi_max_lun, uint, S_IRUGO); int iser_debug_level = 0; -MODULE_DESCRIPTION("iSER (iSCSI Extensions for RDMA) Datamover " - "v" DRV_VER " (" DRV_DATE ")"); +MODULE_DESCRIPTION("iSER (iSCSI Extensions for RDMA) Datamover"); MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Alex Nezhinsky, Dan Bar Dov, Or Gerlitz"); +MODULE_VERSION(DRV_VER); module_param_named(debug_level, iser_debug_level, int, 0644); MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0 (default:disabled)"); @@ -370,8 +370,8 @@ iscsi_iser_conn_bind(struct iscsi_cls_session *cls_session, /* binds the iSER connection retrieved from the previously * connected ep_handle to the iSCSI layer connection. exchanges * connection pointers */ - iser_err("binding iscsi/iser conn %p %p to ib_conn %p\n", - conn, conn->dd_data, ib_conn); + iser_info("binding iscsi/iser conn %p %p to ib_conn %p\n", + conn, conn->dd_data, ib_conn); iser_conn = conn->dd_data; ib_conn->iser_conn = iser_conn; iser_conn->ib_conn = ib_conn; @@ -475,28 +475,28 @@ iscsi_iser_set_param(struct iscsi_cls_conn *cls_conn, case ISCSI_PARAM_HDRDGST_EN: sscanf(buf, "%d", &value); if (value) { - printk(KERN_ERR "DataDigest wasn't negotiated to None"); + iser_err("DataDigest wasn't negotiated to None"); return -EPROTO; } break; case ISCSI_PARAM_DATADGST_EN: sscanf(buf, "%d", &value); if (value) { - printk(KERN_ERR "DataDigest wasn't negotiated to None"); + iser_err("DataDigest wasn't negotiated to None"); return -EPROTO; } break; case ISCSI_PARAM_IFMARKER_EN: sscanf(buf, "%d", &value); if (value) { - printk(KERN_ERR "IFMarker wasn't negotiated to No"); + iser_err("IFMarker wasn't negotiated to No"); return -EPROTO; } break; case ISCSI_PARAM_OFMARKER_EN: sscanf(buf, "%d", &value); if (value) { - printk(KERN_ERR "OFMarker wasn't negotiated to No"); + iser_err("OFMarker wasn't negotiated to No"); return -EPROTO; } break; @@ -596,7 +596,7 @@ iscsi_iser_ep_poll(struct iscsi_endpoint *ep, int timeout_ms) ib_conn->state == ISER_CONN_DOWN)) rc = -1; - iser_err("ib conn %p rc = %d\n", ib_conn, rc); + iser_info("ib conn %p rc = %d\n", ib_conn, rc); if (rc > 0) return 1; /* success, this is the equivalent of POLLOUT */ @@ -623,7 +623,7 @@ iscsi_iser_ep_disconnect(struct iscsi_endpoint *ep) iscsi_suspend_tx(ib_conn->iser_conn->iscsi_conn); - iser_err("ib conn %p state %d\n",ib_conn, ib_conn->state); + iser_info("ib conn %p state %d\n", ib_conn, ib_conn->state); iser_conn_terminate(ib_conn); } @@ -682,7 +682,7 @@ static umode_t iser_attr_is_visible(int param_type, int param) static struct scsi_host_template iscsi_iser_sht = { .module = THIS_MODULE, - .name = "iSCSI Initiator over iSER, v." DRV_VER, + .name = "iSCSI Initiator over iSER", .queuecommand = iscsi_queuecommand, .change_queue_depth = iscsi_change_queue_depth, .sg_tablesize = ISCSI_ISER_SG_TABLESIZE, @@ -740,7 +740,7 @@ static int __init iser_init(void) iser_dbg("Starting iSER datamover...\n"); if (iscsi_max_lun < 1) { - printk(KERN_ERR "Invalid max_lun value of %u\n", iscsi_max_lun); + iser_err("Invalid max_lun value of %u\n", iscsi_max_lun); return -EINVAL; } diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 5babdb35bda7..06f578cde75b 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -42,6 +42,7 @@ #include #include +#include #include #include @@ -65,20 +66,26 @@ #define DRV_NAME "iser" #define PFX DRV_NAME ": " -#define DRV_VER "0.1" -#define DRV_DATE "May 7th, 2006" +#define DRV_VER "1.1" #define iser_dbg(fmt, arg...) \ do { \ - if (iser_debug_level > 1) \ + if (iser_debug_level > 2) \ printk(KERN_DEBUG PFX "%s:" fmt,\ __func__ , ## arg); \ } while (0) #define iser_warn(fmt, arg...) \ + do { \ + if (iser_debug_level > 1) \ + pr_warn(PFX "%s:" fmt, \ + __func__ , ## arg); \ + } while (0) + +#define iser_info(fmt, arg...) \ do { \ if (iser_debug_level > 0) \ - printk(KERN_DEBUG PFX "%s:" fmt,\ + pr_info(PFX "%s:" fmt, \ __func__ , ## arg); \ } while (0) @@ -133,6 +140,15 @@ struct iser_hdr { __be64 read_va; } __attribute__((packed)); + +#define ISER_ZBVA_NOT_SUPPORTED 0x80 +#define ISER_SEND_W_INV_NOT_SUPPORTED 0x40 + +struct iser_cm_hdr { + u8 flags; + u8 rsvd[3]; +} __packed; + /* Constant PDU lengths calculations */ #define ISER_HEADERS_LEN (sizeof(struct iser_hdr) + sizeof(struct iscsi_hdr)) diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c index be1edb04b085..68ebb7fe072a 100644 --- a/drivers/infiniband/ulp/iser/iser_memory.c +++ b/drivers/infiniband/ulp/iser/iser_memory.c @@ -416,8 +416,9 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *iser_task, for (i=0 ; ipage_vec->length ; i++) iser_err("page_vec[%d] = 0x%llx\n", i, (unsigned long long) ib_conn->page_vec->pages[i]); - return err; } + if (err) + return err; } return 0; } diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c index 4debadc53106..5278916c3103 100644 --- a/drivers/infiniband/ulp/iser/iser_verbs.c +++ b/drivers/infiniband/ulp/iser/iser_verbs.c @@ -74,8 +74,9 @@ static int iser_create_device_ib_res(struct iser_device *device) struct iser_cq_desc *cq_desc; device->cqs_used = min(ISER_MAX_CQ, device->ib_device->num_comp_vectors); - iser_err("using %d CQs, device %s supports %d vectors\n", device->cqs_used, - device->ib_device->name, device->ib_device->num_comp_vectors); + iser_info("using %d CQs, device %s supports %d vectors\n", + device->cqs_used, device->ib_device->name, + device->ib_device->num_comp_vectors); device->cq_desc = kmalloc(sizeof(struct iser_cq_desc) * device->cqs_used, GFP_KERNEL); @@ -262,7 +263,7 @@ static int iser_create_ib_conn_res(struct iser_conn *ib_conn) min_index = index; device->cq_active_qps[min_index]++; mutex_unlock(&ig.connlist_mutex); - iser_err("cq index %d used for ib_conn %p\n", min_index, ib_conn); + iser_info("cq index %d used for ib_conn %p\n", min_index, ib_conn); init_attr.event_handler = iser_qp_event_callback; init_attr.qp_context = (void *)ib_conn; @@ -280,9 +281,9 @@ static int iser_create_ib_conn_res(struct iser_conn *ib_conn) goto out_err; ib_conn->qp = ib_conn->cma_id->qp; - iser_err("setting conn %p cma_id %p: fmr_pool %p qp %p\n", - ib_conn, ib_conn->cma_id, - ib_conn->fmr_pool, ib_conn->cma_id->qp); + iser_info("setting conn %p cma_id %p: fmr_pool %p qp %p\n", + ib_conn, ib_conn->cma_id, + ib_conn->fmr_pool, ib_conn->cma_id->qp); return ret; out_err: @@ -299,9 +300,9 @@ static int iser_free_ib_conn_res(struct iser_conn *ib_conn, int can_destroy_id) int cq_index; BUG_ON(ib_conn == NULL); - iser_err("freeing conn %p cma_id %p fmr pool %p qp %p\n", - ib_conn, ib_conn->cma_id, - ib_conn->fmr_pool, ib_conn->qp); + iser_info("freeing conn %p cma_id %p fmr pool %p qp %p\n", + ib_conn, ib_conn->cma_id, + ib_conn->fmr_pool, ib_conn->qp); /* qp is created only once both addr & route are resolved */ if (ib_conn->fmr_pool != NULL) @@ -379,7 +380,7 @@ static void iser_device_try_release(struct iser_device *device) { mutex_lock(&ig.device_list_mutex); device->refcount--; - iser_err("device %p refcount %d\n",device,device->refcount); + iser_info("device %p refcount %d\n", device, device->refcount); if (!device->refcount) { iser_free_device_ib_res(device); list_del(&device->ig_list); @@ -498,6 +499,7 @@ static int iser_route_handler(struct rdma_cm_id *cma_id) { struct rdma_conn_param conn_param; int ret; + struct iser_cm_hdr req_hdr; ret = iser_create_ib_conn_res((struct iser_conn *)cma_id->context); if (ret) @@ -509,6 +511,12 @@ static int iser_route_handler(struct rdma_cm_id *cma_id) conn_param.retry_count = 7; conn_param.rnr_retry_count = 6; + memset(&req_hdr, 0, sizeof(req_hdr)); + req_hdr.flags = (ISER_ZBVA_NOT_SUPPORTED | + ISER_SEND_W_INV_NOT_SUPPORTED); + conn_param.private_data = (void *)&req_hdr; + conn_param.private_data_len = sizeof(struct iser_cm_hdr); + ret = rdma_connect(cma_id, &conn_param); if (ret) { iser_err("failure connecting: %d\n", ret); @@ -558,8 +566,8 @@ static int iser_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *eve { int ret = 0; - iser_err("event %d status %d conn %p id %p\n", - event->event, event->status, cma_id->context, cma_id); + iser_info("event %d status %d conn %p id %p\n", + event->event, event->status, cma_id->context, cma_id); switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: @@ -619,8 +627,8 @@ int iser_connect(struct iser_conn *ib_conn, /* the device is known only --after-- address resolution */ ib_conn->device = NULL; - iser_err("connecting to: %pI4, port 0x%x\n", - &dst_addr->sin_addr, dst_addr->sin_port); + iser_info("connecting to: %pI4, port 0x%x\n", + &dst_addr->sin_addr, dst_addr->sin_port); ib_conn->state = ISER_CONN_PENDING; diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index c09d41b1a2ff..b08ca7a9f76b 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -1374,7 +1374,7 @@ static int srpt_abort_cmd(struct srpt_send_ioctx *ioctx) target_put_sess_cmd(ioctx->ch->sess, &ioctx->cmd); break; default: - WARN_ON("ERROR: unexpected command state"); + WARN(1, "Unexpected command state (%d)", state); break; } diff --git a/drivers/input/keyboard/Kconfig b/drivers/input/keyboard/Kconfig index 6a195d5e90ff..62a2c0e4cc99 100644 --- a/drivers/input/keyboard/Kconfig +++ b/drivers/input/keyboard/Kconfig @@ -175,7 +175,7 @@ config KEYBOARD_EP93XX config KEYBOARD_GPIO tristate "GPIO Buttons" - depends on GENERIC_GPIO + depends on GPIOLIB help This driver implements support for buttons connected to GPIO pins of various CPUs (and some other chips). @@ -190,7 +190,7 @@ config KEYBOARD_GPIO config KEYBOARD_GPIO_POLLED tristate "Polled GPIO buttons" - depends on GENERIC_GPIO + depends on GPIOLIB select INPUT_POLLDEV help This driver implements support for buttons connected @@ -241,7 +241,7 @@ config KEYBOARD_TCA8418 config KEYBOARD_MATRIX tristate "GPIO driven matrix keypad support" - depends on GENERIC_GPIO + depends on GPIOLIB select INPUT_MATRIXKMAP help Enable support for GPIO driven matrix keypad. diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig index af80928a46b4..bb698e1f9e42 100644 --- a/drivers/input/misc/Kconfig +++ b/drivers/input/misc/Kconfig @@ -214,7 +214,7 @@ config INPUT_APANEL config INPUT_GP2A tristate "Sharp GP2AP002A00F I2C Proximity/Opto sensor driver" depends on I2C - depends on GENERIC_GPIO + depends on GPIOLIB help Say Y here if you have a Sharp GP2AP002A00F proximity/als combo-chip hooked to an I2C bus. @@ -224,7 +224,7 @@ config INPUT_GP2A config INPUT_GPIO_TILT_POLLED tristate "Polled GPIO tilt switch" - depends on GENERIC_GPIO + depends on GPIOLIB select INPUT_POLLDEV help This driver implements support for tilt switches connected @@ -472,7 +472,7 @@ config INPUT_PWM_BEEPER config INPUT_GPIO_ROTARY_ENCODER tristate "Rotary encoders connected to GPIO pins" - depends on GPIOLIB && GENERIC_GPIO + depends on GPIOLIB help Say Y here to add support for rotary encoders connected to GPIO lines. Check file:Documentation/input/rotary-encoder.txt for more @@ -484,7 +484,7 @@ config INPUT_GPIO_ROTARY_ENCODER config INPUT_RB532_BUTTON tristate "Mikrotik Routerboard 532 button interface" depends on MIKROTIK_RB532 - depends on GPIOLIB && GENERIC_GPIO + depends on GPIOLIB select INPUT_POLLDEV help Say Y here if you want support for the S1 button built into diff --git a/drivers/input/mouse/Kconfig b/drivers/input/mouse/Kconfig index 802bd6a72d73..effa9c5f2c5c 100644 --- a/drivers/input/mouse/Kconfig +++ b/drivers/input/mouse/Kconfig @@ -295,7 +295,7 @@ config MOUSE_VSXXXAA config MOUSE_GPIO tristate "GPIO mouse" - depends on GENERIC_GPIO + depends on GPIOLIB select INPUT_POLLDEV help This driver simulates a mouse on GPIO lines of various CPUs (and some diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig index d44806d41b44..ef992293598a 100644 --- a/drivers/leds/Kconfig +++ b/drivers/leds/Kconfig @@ -173,7 +173,7 @@ config LEDS_PCA9532_GPIO config LEDS_GPIO tristate "LED Support for GPIO connected LEDs" depends on LEDS_CLASS - depends on GENERIC_GPIO + depends on GPIOLIB help This option enables support for the LEDs connected to GPIO outputs. To be useful the particular board must have LEDs @@ -362,7 +362,7 @@ config LEDS_INTEL_SS4200 config LEDS_LT3593 tristate "LED driver for LT3593 controllers" depends on LEDS_CLASS - depends on GENERIC_GPIO + depends on GPIOLIB help This option enables support for LEDs driven by a Linear Technology LT3593 controller. This controller uses a special one-wire pulse @@ -431,7 +431,7 @@ config LEDS_ASIC3 config LEDS_RENESAS_TPU bool "LED support for Renesas TPU" - depends on LEDS_CLASS=y && HAVE_CLK && GENERIC_GPIO + depends on LEDS_CLASS=y && HAVE_CLK && GPIOLIB help This option enables build of the LED TPU platform driver, suitable to drive any TPU channel on newer Renesas SoCs. diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig index 4d8d90b4fe78..3bfc8f1da9fe 100644 --- a/drivers/md/Kconfig +++ b/drivers/md/Kconfig @@ -174,6 +174,8 @@ config MD_FAULTY In unsure, say N. +source "drivers/md/bcache/Kconfig" + config BLK_DEV_DM tristate "Device mapper support" ---help--- diff --git a/drivers/md/Makefile b/drivers/md/Makefile index 7ceeaefc0e95..1439fd4ad9b1 100644 --- a/drivers/md/Makefile +++ b/drivers/md/Makefile @@ -29,6 +29,7 @@ obj-$(CONFIG_MD_RAID10) += raid10.o obj-$(CONFIG_MD_RAID456) += raid456.o obj-$(CONFIG_MD_MULTIPATH) += multipath.o obj-$(CONFIG_MD_FAULTY) += faulty.o +obj-$(CONFIG_BCACHE) += bcache/ obj-$(CONFIG_BLK_DEV_MD) += md-mod.o obj-$(CONFIG_BLK_DEV_DM) += dm-mod.o obj-$(CONFIG_DM_BUFIO) += dm-bufio.o diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig new file mode 100644 index 000000000000..05c220d05e23 --- /dev/null +++ b/drivers/md/bcache/Kconfig @@ -0,0 +1,42 @@ + +config BCACHE + tristate "Block device as cache" + select CLOSURES + ---help--- + Allows a block device to be used as cache for other devices; uses + a btree for indexing and the layout is optimized for SSDs. + + See Documentation/bcache.txt for details. + +config BCACHE_DEBUG + bool "Bcache debugging" + depends on BCACHE + ---help--- + Don't select this option unless you're a developer + + Enables extra debugging tools (primarily a fuzz tester) + +config BCACHE_EDEBUG + bool "Extended runtime checks" + depends on BCACHE + ---help--- + Don't select this option unless you're a developer + + Enables extra runtime checks which significantly affect performance + +config BCACHE_CLOSURES_DEBUG + bool "Debug closures" + depends on BCACHE + select DEBUG_FS + ---help--- + Keeps all active closures in a linked list and provides a debugfs + interface to list them, which makes it possible to see asynchronous + operations that get stuck. + +# cgroup code needs to be updated: +# +#config CGROUP_BCACHE +# bool "Cgroup controls for bcache" +# depends on BCACHE && BLK_CGROUP +# ---help--- +# TODO diff --git a/drivers/md/bcache/Makefile b/drivers/md/bcache/Makefile new file mode 100644 index 000000000000..0e9c82523be6 --- /dev/null +++ b/drivers/md/bcache/Makefile @@ -0,0 +1,7 @@ + +obj-$(CONFIG_BCACHE) += bcache.o + +bcache-y := alloc.o btree.o bset.o io.o journal.o writeback.o\ + movinggc.o request.o super.o sysfs.o debug.o util.o trace.o stats.o closure.o + +CFLAGS_request.o += -Iblock diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c new file mode 100644 index 000000000000..048f2947e08b --- /dev/null +++ b/drivers/md/bcache/alloc.c @@ -0,0 +1,599 @@ +/* + * Primary bucket allocation code + * + * Copyright 2012 Google, Inc. + * + * Allocation in bcache is done in terms of buckets: + * + * Each bucket has associated an 8 bit gen; this gen corresponds to the gen in + * btree pointers - they must match for the pointer to be considered valid. + * + * Thus (assuming a bucket has no dirty data or metadata in it) we can reuse a + * bucket simply by incrementing its gen. + * + * The gens (along with the priorities; it's really the gens are important but + * the code is named as if it's the priorities) are written in an arbitrary list + * of buckets on disk, with a pointer to them in the journal header. + * + * When we invalidate a bucket, we have to write its new gen to disk and wait + * for that write to complete before we use it - otherwise after a crash we + * could have pointers that appeared to be good but pointed to data that had + * been overwritten. + * + * Since the gens and priorities are all stored contiguously on disk, we can + * batch this up: We fill up the free_inc list with freshly invalidated buckets, + * call prio_write(), and when prio_write() finishes we pull buckets off the + * free_inc list and optionally discard them. + * + * free_inc isn't the only freelist - if it was, we'd often to sleep while + * priorities and gens were being written before we could allocate. c->free is a + * smaller freelist, and buckets on that list are always ready to be used. + * + * If we've got discards enabled, that happens when a bucket moves from the + * free_inc list to the free list. + * + * There is another freelist, because sometimes we have buckets that we know + * have nothing pointing into them - these we can reuse without waiting for + * priorities to be rewritten. These come from freed btree nodes and buckets + * that garbage collection discovered no longer had valid keys pointing into + * them (because they were overwritten). That's the unused list - buckets on the + * unused list move to the free list, optionally being discarded in the process. + * + * It's also important to ensure that gens don't wrap around - with respect to + * either the oldest gen in the btree or the gen on disk. This is quite + * difficult to do in practice, but we explicitly guard against it anyways - if + * a bucket is in danger of wrapping around we simply skip invalidating it that + * time around, and we garbage collect or rewrite the priorities sooner than we + * would have otherwise. + * + * bch_bucket_alloc() allocates a single bucket from a specific cache. + * + * bch_bucket_alloc_set() allocates one or more buckets from different caches + * out of a cache set. + * + * free_some_buckets() drives all the processes described above. It's called + * from bch_bucket_alloc() and a few other places that need to make sure free + * buckets are ready. + * + * invalidate_buckets_(lru|fifo)() find buckets that are available to be + * invalidated, and then invalidate them and stick them on the free_inc list - + * in either lru or fifo order. + */ + +#include "bcache.h" +#include "btree.h" + +#include + +#define MAX_IN_FLIGHT_DISCARDS 8U + +/* Bucket heap / gen */ + +uint8_t bch_inc_gen(struct cache *ca, struct bucket *b) +{ + uint8_t ret = ++b->gen; + + ca->set->need_gc = max(ca->set->need_gc, bucket_gc_gen(b)); + WARN_ON_ONCE(ca->set->need_gc > BUCKET_GC_GEN_MAX); + + if (CACHE_SYNC(&ca->set->sb)) { + ca->need_save_prio = max(ca->need_save_prio, + bucket_disk_gen(b)); + WARN_ON_ONCE(ca->need_save_prio > BUCKET_DISK_GEN_MAX); + } + + return ret; +} + +void bch_rescale_priorities(struct cache_set *c, int sectors) +{ + struct cache *ca; + struct bucket *b; + unsigned next = c->nbuckets * c->sb.bucket_size / 1024; + unsigned i; + int r; + + atomic_sub(sectors, &c->rescale); + + do { + r = atomic_read(&c->rescale); + + if (r >= 0) + return; + } while (atomic_cmpxchg(&c->rescale, r, r + next) != r); + + mutex_lock(&c->bucket_lock); + + c->min_prio = USHRT_MAX; + + for_each_cache(ca, c, i) + for_each_bucket(b, ca) + if (b->prio && + b->prio != BTREE_PRIO && + !atomic_read(&b->pin)) { + b->prio--; + c->min_prio = min(c->min_prio, b->prio); + } + + mutex_unlock(&c->bucket_lock); +} + +/* Discard/TRIM */ + +struct discard { + struct list_head list; + struct work_struct work; + struct cache *ca; + long bucket; + + struct bio bio; + struct bio_vec bv; +}; + +static void discard_finish(struct work_struct *w) +{ + struct discard *d = container_of(w, struct discard, work); + struct cache *ca = d->ca; + char buf[BDEVNAME_SIZE]; + + if (!test_bit(BIO_UPTODATE, &d->bio.bi_flags)) { + pr_notice("discard error on %s, disabling", + bdevname(ca->bdev, buf)); + d->ca->discard = 0; + } + + mutex_lock(&ca->set->bucket_lock); + + fifo_push(&ca->free, d->bucket); + list_add(&d->list, &ca->discards); + atomic_dec(&ca->discards_in_flight); + + mutex_unlock(&ca->set->bucket_lock); + + closure_wake_up(&ca->set->bucket_wait); + wake_up(&ca->set->alloc_wait); + + closure_put(&ca->set->cl); +} + +static void discard_endio(struct bio *bio, int error) +{ + struct discard *d = container_of(bio, struct discard, bio); + schedule_work(&d->work); +} + +static void do_discard(struct cache *ca, long bucket) +{ + struct discard *d = list_first_entry(&ca->discards, + struct discard, list); + + list_del(&d->list); + d->bucket = bucket; + + atomic_inc(&ca->discards_in_flight); + closure_get(&ca->set->cl); + + bio_init(&d->bio); + + d->bio.bi_sector = bucket_to_sector(ca->set, d->bucket); + d->bio.bi_bdev = ca->bdev; + d->bio.bi_rw = REQ_WRITE|REQ_DISCARD; + d->bio.bi_max_vecs = 1; + d->bio.bi_io_vec = d->bio.bi_inline_vecs; + d->bio.bi_size = bucket_bytes(ca); + d->bio.bi_end_io = discard_endio; + bio_set_prio(&d->bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)); + + submit_bio(0, &d->bio); +} + +/* Allocation */ + +static inline bool can_inc_bucket_gen(struct bucket *b) +{ + return bucket_gc_gen(b) < BUCKET_GC_GEN_MAX && + bucket_disk_gen(b) < BUCKET_DISK_GEN_MAX; +} + +bool bch_bucket_add_unused(struct cache *ca, struct bucket *b) +{ + BUG_ON(GC_MARK(b) || GC_SECTORS_USED(b)); + + if (fifo_used(&ca->free) > ca->watermark[WATERMARK_MOVINGGC] && + CACHE_REPLACEMENT(&ca->sb) == CACHE_REPLACEMENT_FIFO) + return false; + + b->prio = 0; + + if (can_inc_bucket_gen(b) && + fifo_push(&ca->unused, b - ca->buckets)) { + atomic_inc(&b->pin); + return true; + } + + return false; +} + +static bool can_invalidate_bucket(struct cache *ca, struct bucket *b) +{ + return GC_MARK(b) == GC_MARK_RECLAIMABLE && + !atomic_read(&b->pin) && + can_inc_bucket_gen(b); +} + +static void invalidate_one_bucket(struct cache *ca, struct bucket *b) +{ + bch_inc_gen(ca, b); + b->prio = INITIAL_PRIO; + atomic_inc(&b->pin); + fifo_push(&ca->free_inc, b - ca->buckets); +} + +#define bucket_prio(b) \ + (((unsigned) (b->prio - ca->set->min_prio)) * GC_SECTORS_USED(b)) + +#define bucket_max_cmp(l, r) (bucket_prio(l) < bucket_prio(r)) +#define bucket_min_cmp(l, r) (bucket_prio(l) > bucket_prio(r)) + +static void invalidate_buckets_lru(struct cache *ca) +{ + struct bucket *b; + ssize_t i; + + ca->heap.used = 0; + + for_each_bucket(b, ca) { + /* + * If we fill up the unused list, if we then return before + * adding anything to the free_inc list we'll skip writing + * prios/gens and just go back to allocating from the unused + * list: + */ + if (fifo_full(&ca->unused)) + return; + + if (!can_invalidate_bucket(ca, b)) + continue; + + if (!GC_SECTORS_USED(b) && + bch_bucket_add_unused(ca, b)) + continue; + + if (!heap_full(&ca->heap)) + heap_add(&ca->heap, b, bucket_max_cmp); + else if (bucket_max_cmp(b, heap_peek(&ca->heap))) { + ca->heap.data[0] = b; + heap_sift(&ca->heap, 0, bucket_max_cmp); + } + } + + for (i = ca->heap.used / 2 - 1; i >= 0; --i) + heap_sift(&ca->heap, i, bucket_min_cmp); + + while (!fifo_full(&ca->free_inc)) { + if (!heap_pop(&ca->heap, b, bucket_min_cmp)) { + /* + * We don't want to be calling invalidate_buckets() + * multiple times when it can't do anything + */ + ca->invalidate_needs_gc = 1; + bch_queue_gc(ca->set); + return; + } + + invalidate_one_bucket(ca, b); + } +} + +static void invalidate_buckets_fifo(struct cache *ca) +{ + struct bucket *b; + size_t checked = 0; + + while (!fifo_full(&ca->free_inc)) { + if (ca->fifo_last_bucket < ca->sb.first_bucket || + ca->fifo_last_bucket >= ca->sb.nbuckets) + ca->fifo_last_bucket = ca->sb.first_bucket; + + b = ca->buckets + ca->fifo_last_bucket++; + + if (can_invalidate_bucket(ca, b)) + invalidate_one_bucket(ca, b); + + if (++checked >= ca->sb.nbuckets) { + ca->invalidate_needs_gc = 1; + bch_queue_gc(ca->set); + return; + } + } +} + +static void invalidate_buckets_random(struct cache *ca) +{ + struct bucket *b; + size_t checked = 0; + + while (!fifo_full(&ca->free_inc)) { + size_t n; + get_random_bytes(&n, sizeof(n)); + + n %= (size_t) (ca->sb.nbuckets - ca->sb.first_bucket); + n += ca->sb.first_bucket; + + b = ca->buckets + n; + + if (can_invalidate_bucket(ca, b)) + invalidate_one_bucket(ca, b); + + if (++checked >= ca->sb.nbuckets / 2) { + ca->invalidate_needs_gc = 1; + bch_queue_gc(ca->set); + return; + } + } +} + +static void invalidate_buckets(struct cache *ca) +{ + if (ca->invalidate_needs_gc) + return; + + switch (CACHE_REPLACEMENT(&ca->sb)) { + case CACHE_REPLACEMENT_LRU: + invalidate_buckets_lru(ca); + break; + case CACHE_REPLACEMENT_FIFO: + invalidate_buckets_fifo(ca); + break; + case CACHE_REPLACEMENT_RANDOM: + invalidate_buckets_random(ca); + break; + } + + pr_debug("free %zu/%zu free_inc %zu/%zu unused %zu/%zu", + fifo_used(&ca->free), ca->free.size, + fifo_used(&ca->free_inc), ca->free_inc.size, + fifo_used(&ca->unused), ca->unused.size); +} + +#define allocator_wait(ca, cond) \ +do { \ + DEFINE_WAIT(__wait); \ + \ + while (1) { \ + prepare_to_wait(&ca->set->alloc_wait, \ + &__wait, TASK_INTERRUPTIBLE); \ + if (cond) \ + break; \ + \ + mutex_unlock(&(ca)->set->bucket_lock); \ + if (test_bit(CACHE_SET_STOPPING_2, &ca->set->flags)) { \ + finish_wait(&ca->set->alloc_wait, &__wait); \ + closure_return(cl); \ + } \ + \ + schedule(); \ + mutex_lock(&(ca)->set->bucket_lock); \ + } \ + \ + finish_wait(&ca->set->alloc_wait, &__wait); \ +} while (0) + +void bch_allocator_thread(struct closure *cl) +{ + struct cache *ca = container_of(cl, struct cache, alloc); + + mutex_lock(&ca->set->bucket_lock); + + while (1) { + /* + * First, we pull buckets off of the unused and free_inc lists, + * possibly issue discards to them, then we add the bucket to + * the free list: + */ + while (1) { + long bucket; + + if ((!atomic_read(&ca->set->prio_blocked) || + !CACHE_SYNC(&ca->set->sb)) && + !fifo_empty(&ca->unused)) + fifo_pop(&ca->unused, bucket); + else if (!fifo_empty(&ca->free_inc)) + fifo_pop(&ca->free_inc, bucket); + else + break; + + allocator_wait(ca, (int) fifo_free(&ca->free) > + atomic_read(&ca->discards_in_flight)); + + if (ca->discard) { + allocator_wait(ca, !list_empty(&ca->discards)); + do_discard(ca, bucket); + } else { + fifo_push(&ca->free, bucket); + closure_wake_up(&ca->set->bucket_wait); + } + } + + /* + * We've run out of free buckets, we need to find some buckets + * we can invalidate. First, invalidate them in memory and add + * them to the free_inc list: + */ + + allocator_wait(ca, ca->set->gc_mark_valid && + (ca->need_save_prio > 64 || + !ca->invalidate_needs_gc)); + invalidate_buckets(ca); + + /* + * Now, we write their new gens to disk so we can start writing + * new stuff to them: + */ + allocator_wait(ca, !atomic_read(&ca->set->prio_blocked)); + if (CACHE_SYNC(&ca->set->sb) && + (!fifo_empty(&ca->free_inc) || + ca->need_save_prio > 64)) + bch_prio_write(ca); + } +} + +long bch_bucket_alloc(struct cache *ca, unsigned watermark, struct closure *cl) +{ + long r = -1; +again: + wake_up(&ca->set->alloc_wait); + + if (fifo_used(&ca->free) > ca->watermark[watermark] && + fifo_pop(&ca->free, r)) { + struct bucket *b = ca->buckets + r; +#ifdef CONFIG_BCACHE_EDEBUG + size_t iter; + long i; + + for (iter = 0; iter < prio_buckets(ca) * 2; iter++) + BUG_ON(ca->prio_buckets[iter] == (uint64_t) r); + + fifo_for_each(i, &ca->free, iter) + BUG_ON(i == r); + fifo_for_each(i, &ca->free_inc, iter) + BUG_ON(i == r); + fifo_for_each(i, &ca->unused, iter) + BUG_ON(i == r); +#endif + BUG_ON(atomic_read(&b->pin) != 1); + + SET_GC_SECTORS_USED(b, ca->sb.bucket_size); + + if (watermark <= WATERMARK_METADATA) { + SET_GC_MARK(b, GC_MARK_METADATA); + b->prio = BTREE_PRIO; + } else { + SET_GC_MARK(b, GC_MARK_RECLAIMABLE); + b->prio = INITIAL_PRIO; + } + + return r; + } + + pr_debug("alloc failure: blocked %i free %zu free_inc %zu unused %zu", + atomic_read(&ca->set->prio_blocked), fifo_used(&ca->free), + fifo_used(&ca->free_inc), fifo_used(&ca->unused)); + + if (cl) { + closure_wait(&ca->set->bucket_wait, cl); + + if (closure_blocking(cl)) { + mutex_unlock(&ca->set->bucket_lock); + closure_sync(cl); + mutex_lock(&ca->set->bucket_lock); + goto again; + } + } + + return -1; +} + +void bch_bucket_free(struct cache_set *c, struct bkey *k) +{ + unsigned i; + + for (i = 0; i < KEY_PTRS(k); i++) { + struct bucket *b = PTR_BUCKET(c, k, i); + + SET_GC_MARK(b, GC_MARK_RECLAIMABLE); + SET_GC_SECTORS_USED(b, 0); + bch_bucket_add_unused(PTR_CACHE(c, k, i), b); + } +} + +int __bch_bucket_alloc_set(struct cache_set *c, unsigned watermark, + struct bkey *k, int n, struct closure *cl) +{ + int i; + + lockdep_assert_held(&c->bucket_lock); + BUG_ON(!n || n > c->caches_loaded || n > 8); + + bkey_init(k); + + /* sort by free space/prio of oldest data in caches */ + + for (i = 0; i < n; i++) { + struct cache *ca = c->cache_by_alloc[i]; + long b = bch_bucket_alloc(ca, watermark, cl); + + if (b == -1) + goto err; + + k->ptr[i] = PTR(ca->buckets[b].gen, + bucket_to_sector(c, b), + ca->sb.nr_this_dev); + + SET_KEY_PTRS(k, i + 1); + } + + return 0; +err: + bch_bucket_free(c, k); + __bkey_put(c, k); + return -1; +} + +int bch_bucket_alloc_set(struct cache_set *c, unsigned watermark, + struct bkey *k, int n, struct closure *cl) +{ + int ret; + mutex_lock(&c->bucket_lock); + ret = __bch_bucket_alloc_set(c, watermark, k, n, cl); + mutex_unlock(&c->bucket_lock); + return ret; +} + +/* Init */ + +void bch_cache_allocator_exit(struct cache *ca) +{ + struct discard *d; + + while (!list_empty(&ca->discards)) { + d = list_first_entry(&ca->discards, struct discard, list); + cancel_work_sync(&d->work); + list_del(&d->list); + kfree(d); + } +} + +int bch_cache_allocator_init(struct cache *ca) +{ + unsigned i; + + /* + * Reserve: + * Prio/gen writes first + * Then 8 for btree allocations + * Then half for the moving garbage collector + */ + + ca->watermark[WATERMARK_PRIO] = 0; + + ca->watermark[WATERMARK_METADATA] = prio_buckets(ca); + + ca->watermark[WATERMARK_MOVINGGC] = 8 + + ca->watermark[WATERMARK_METADATA]; + + ca->watermark[WATERMARK_NONE] = ca->free.size / 2 + + ca->watermark[WATERMARK_MOVINGGC]; + + for (i = 0; i < MAX_IN_FLIGHT_DISCARDS; i++) { + struct discard *d = kzalloc(sizeof(*d), GFP_KERNEL); + if (!d) + return -ENOMEM; + + d->ca = ca; + INIT_WORK(&d->work, discard_finish); + list_add(&d->list, &ca->discards); + } + + return 0; +} diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h new file mode 100644 index 000000000000..340146d7c17f --- /dev/null +++ b/drivers/md/bcache/bcache.h @@ -0,0 +1,1259 @@ +#ifndef _BCACHE_H +#define _BCACHE_H + +/* + * SOME HIGH LEVEL CODE DOCUMENTATION: + * + * Bcache mostly works with cache sets, cache devices, and backing devices. + * + * Support for multiple cache devices hasn't quite been finished off yet, but + * it's about 95% plumbed through. A cache set and its cache devices is sort of + * like a md raid array and its component devices. Most of the code doesn't care + * about individual cache devices, the main abstraction is the cache set. + * + * Multiple cache devices is intended to give us the ability to mirror dirty + * cached data and metadata, without mirroring clean cached data. + * + * Backing devices are different, in that they have a lifetime independent of a + * cache set. When you register a newly formatted backing device it'll come up + * in passthrough mode, and then you can attach and detach a backing device from + * a cache set at runtime - while it's mounted and in use. Detaching implicitly + * invalidates any cached data for that backing device. + * + * A cache set can have multiple (many) backing devices attached to it. + * + * There's also flash only volumes - this is the reason for the distinction + * between struct cached_dev and struct bcache_device. A flash only volume + * works much like a bcache device that has a backing device, except the + * "cached" data is always dirty. The end result is that we get thin + * provisioning with very little additional code. + * + * Flash only volumes work but they're not production ready because the moving + * garbage collector needs more work. More on that later. + * + * BUCKETS/ALLOCATION: + * + * Bcache is primarily designed for caching, which means that in normal + * operation all of our available space will be allocated. Thus, we need an + * efficient way of deleting things from the cache so we can write new things to + * it. + * + * To do this, we first divide the cache device up into buckets. A bucket is the + * unit of allocation; they're typically around 1 mb - anywhere from 128k to 2M+ + * works efficiently. + * + * Each bucket has a 16 bit priority, and an 8 bit generation associated with + * it. The gens and priorities for all the buckets are stored contiguously and + * packed on disk (in a linked list of buckets - aside from the superblock, all + * of bcache's metadata is stored in buckets). + * + * The priority is used to implement an LRU. We reset a bucket's priority when + * we allocate it or on cache it, and every so often we decrement the priority + * of each bucket. It could be used to implement something more sophisticated, + * if anyone ever gets around to it. + * + * The generation is used for invalidating buckets. Each pointer also has an 8 + * bit generation embedded in it; for a pointer to be considered valid, its gen + * must match the gen of the bucket it points into. Thus, to reuse a bucket all + * we have to do is increment its gen (and write its new gen to disk; we batch + * this up). + * + * Bcache is entirely COW - we never write twice to a bucket, even buckets that + * contain metadata (including btree nodes). + * + * THE BTREE: + * + * Bcache is in large part design around the btree. + * + * At a high level, the btree is just an index of key -> ptr tuples. + * + * Keys represent extents, and thus have a size field. Keys also have a variable + * number of pointers attached to them (potentially zero, which is handy for + * invalidating the cache). + * + * The key itself is an inode:offset pair. The inode number corresponds to a + * backing device or a flash only volume. The offset is the ending offset of the + * extent within the inode - not the starting offset; this makes lookups + * slightly more convenient. + * + * Pointers contain the cache device id, the offset on that device, and an 8 bit + * generation number. More on the gen later. + * + * Index lookups are not fully abstracted - cache lookups in particular are + * still somewhat mixed in with the btree code, but things are headed in that + * direction. + * + * Updates are fairly well abstracted, though. There are two different ways of + * updating the btree; insert and replace. + * + * BTREE_INSERT will just take a list of keys and insert them into the btree - + * overwriting (possibly only partially) any extents they overlap with. This is + * used to update the index after a write. + * + * BTREE_REPLACE is really cmpxchg(); it inserts a key into the btree iff it is + * overwriting a key that matches another given key. This is used for inserting + * data into the cache after a cache miss, and for background writeback, and for + * the moving garbage collector. + * + * There is no "delete" operation; deleting things from the index is + * accomplished by either by invalidating pointers (by incrementing a bucket's + * gen) or by inserting a key with 0 pointers - which will overwrite anything + * previously present at that location in the index. + * + * This means that there are always stale/invalid keys in the btree. They're + * filtered out by the code that iterates through a btree node, and removed when + * a btree node is rewritten. + * + * BTREE NODES: + * + * Our unit of allocation is a bucket, and we we can't arbitrarily allocate and + * free smaller than a bucket - so, that's how big our btree nodes are. + * + * (If buckets are really big we'll only use part of the bucket for a btree node + * - no less than 1/4th - but a bucket still contains no more than a single + * btree node. I'd actually like to change this, but for now we rely on the + * bucket's gen for deleting btree nodes when we rewrite/split a node.) + * + * Anyways, btree nodes are big - big enough to be inefficient with a textbook + * btree implementation. + * + * The way this is solved is that btree nodes are internally log structured; we + * can append new keys to an existing btree node without rewriting it. This + * means each set of keys we write is sorted, but the node is not. + * + * We maintain this log structure in memory - keeping 1Mb of keys sorted would + * be expensive, and we have to distinguish between the keys we have written and + * the keys we haven't. So to do a lookup in a btree node, we have to search + * each sorted set. But we do merge written sets together lazily, so the cost of + * these extra searches is quite low (normally most of the keys in a btree node + * will be in one big set, and then there'll be one or two sets that are much + * smaller). + * + * This log structure makes bcache's btree more of a hybrid between a + * conventional btree and a compacting data structure, with some of the + * advantages of both. + * + * GARBAGE COLLECTION: + * + * We can't just invalidate any bucket - it might contain dirty data or + * metadata. If it once contained dirty data, other writes might overwrite it + * later, leaving no valid pointers into that bucket in the index. + * + * Thus, the primary purpose of garbage collection is to find buckets to reuse. + * It also counts how much valid data it each bucket currently contains, so that + * allocation can reuse buckets sooner when they've been mostly overwritten. + * + * It also does some things that are really internal to the btree + * implementation. If a btree node contains pointers that are stale by more than + * some threshold, it rewrites the btree node to avoid the bucket's generation + * wrapping around. It also merges adjacent btree nodes if they're empty enough. + * + * THE JOURNAL: + * + * Bcache's journal is not necessary for consistency; we always strictly + * order metadata writes so that the btree and everything else is consistent on + * disk in the event of an unclean shutdown, and in fact bcache had writeback + * caching (with recovery from unclean shutdown) before journalling was + * implemented. + * + * Rather, the journal is purely a performance optimization; we can't complete a + * write until we've updated the index on disk, otherwise the cache would be + * inconsistent in the event of an unclean shutdown. This means that without the + * journal, on random write workloads we constantly have to update all the leaf + * nodes in the btree, and those writes will be mostly empty (appending at most + * a few keys each) - highly inefficient in terms of amount of metadata writes, + * and it puts more strain on the various btree resorting/compacting code. + * + * The journal is just a log of keys we've inserted; on startup we just reinsert + * all the keys in the open journal entries. That means that when we're updating + * a node in the btree, we can wait until a 4k block of keys fills up before + * writing them out. + * + * For simplicity, we only journal updates to leaf nodes; updates to parent + * nodes are rare enough (since our leaf nodes are huge) that it wasn't worth + * the complexity to deal with journalling them (in particular, journal replay) + * - updates to non leaf nodes just happen synchronously (see btree_split()). + */ + +#define pr_fmt(fmt) "bcache: %s() " fmt "\n", __func__ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "util.h" +#include "closure.h" + +struct bucket { + atomic_t pin; + uint16_t prio; + uint8_t gen; + uint8_t disk_gen; + uint8_t last_gc; /* Most out of date gen in the btree */ + uint8_t gc_gen; + uint16_t gc_mark; +}; + +/* + * I'd use bitfields for these, but I don't trust the compiler not to screw me + * as multiple threads touch struct bucket without locking + */ + +BITMASK(GC_MARK, struct bucket, gc_mark, 0, 2); +#define GC_MARK_RECLAIMABLE 0 +#define GC_MARK_DIRTY 1 +#define GC_MARK_METADATA 2 +BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 14); + +struct bkey { + uint64_t high; + uint64_t low; + uint64_t ptr[]; +}; + +/* Enough for a key with 6 pointers */ +#define BKEY_PAD 8 + +#define BKEY_PADDED(key) \ + union { struct bkey key; uint64_t key ## _pad[BKEY_PAD]; } + +/* Version 0: Cache device + * Version 1: Backing device + * Version 2: Seed pointer into btree node checksum + * Version 3: Cache device with new UUID format + * Version 4: Backing device with data offset + */ +#define BCACHE_SB_VERSION_CDEV 0 +#define BCACHE_SB_VERSION_BDEV 1 +#define BCACHE_SB_VERSION_CDEV_WITH_UUID 3 +#define BCACHE_SB_VERSION_BDEV_WITH_OFFSET 4 +#define BCACHE_SB_MAX_VERSION 4 + +#define SB_SECTOR 8 +#define SB_SIZE 4096 +#define SB_LABEL_SIZE 32 +#define SB_JOURNAL_BUCKETS 256U +/* SB_JOURNAL_BUCKETS must be divisible by BITS_PER_LONG */ +#define MAX_CACHES_PER_SET 8 + +#define BDEV_DATA_START_DEFAULT 16 /* sectors */ + +struct cache_sb { + uint64_t csum; + uint64_t offset; /* sector where this sb was written */ + uint64_t version; + + uint8_t magic[16]; + + uint8_t uuid[16]; + union { + uint8_t set_uuid[16]; + uint64_t set_magic; + }; + uint8_t label[SB_LABEL_SIZE]; + + uint64_t flags; + uint64_t seq; + uint64_t pad[8]; + + union { + struct { + /* Cache devices */ + uint64_t nbuckets; /* device size */ + + uint16_t block_size; /* sectors */ + uint16_t bucket_size; /* sectors */ + + uint16_t nr_in_set; + uint16_t nr_this_dev; + }; + struct { + /* Backing devices */ + uint64_t data_offset; + + /* + * block_size from the cache device section is still used by + * backing devices, so don't add anything here until we fix + * things to not need it for backing devices anymore + */ + }; + }; + + uint32_t last_mount; /* time_t */ + + uint16_t first_bucket; + union { + uint16_t njournal_buckets; + uint16_t keys; + }; + uint64_t d[SB_JOURNAL_BUCKETS]; /* journal buckets */ +}; + +BITMASK(CACHE_SYNC, struct cache_sb, flags, 0, 1); +BITMASK(CACHE_DISCARD, struct cache_sb, flags, 1, 1); +BITMASK(CACHE_REPLACEMENT, struct cache_sb, flags, 2, 3); +#define CACHE_REPLACEMENT_LRU 0U +#define CACHE_REPLACEMENT_FIFO 1U +#define CACHE_REPLACEMENT_RANDOM 2U + +BITMASK(BDEV_CACHE_MODE, struct cache_sb, flags, 0, 4); +#define CACHE_MODE_WRITETHROUGH 0U +#define CACHE_MODE_WRITEBACK 1U +#define CACHE_MODE_WRITEAROUND 2U +#define CACHE_MODE_NONE 3U +BITMASK(BDEV_STATE, struct cache_sb, flags, 61, 2); +#define BDEV_STATE_NONE 0U +#define BDEV_STATE_CLEAN 1U +#define BDEV_STATE_DIRTY 2U +#define BDEV_STATE_STALE 3U + +/* Version 1: Seed pointer into btree node checksum + */ +#define BCACHE_BSET_VERSION 1 + +/* + * This is the on disk format for btree nodes - a btree node on disk is a list + * of these; within each set the keys are sorted + */ +struct bset { + uint64_t csum; + uint64_t magic; + uint64_t seq; + uint32_t version; + uint32_t keys; + + union { + struct bkey start[0]; + uint64_t d[0]; + }; +}; + +/* + * On disk format for priorities and gens - see super.c near prio_write() for + * more. + */ +struct prio_set { + uint64_t csum; + uint64_t magic; + uint64_t seq; + uint32_t version; + uint32_t pad; + + uint64_t next_bucket; + + struct bucket_disk { + uint16_t prio; + uint8_t gen; + } __attribute((packed)) data[]; +}; + +struct uuid_entry { + union { + struct { + uint8_t uuid[16]; + uint8_t label[32]; + uint32_t first_reg; + uint32_t last_reg; + uint32_t invalidated; + + uint32_t flags; + /* Size of flash only volumes */ + uint64_t sectors; + }; + + uint8_t pad[128]; + }; +}; + +BITMASK(UUID_FLASH_ONLY, struct uuid_entry, flags, 0, 1); + +#include "journal.h" +#include "stats.h" +struct search; +struct btree; +struct keybuf; + +struct keybuf_key { + struct rb_node node; + BKEY_PADDED(key); + void *private; +}; + +typedef bool (keybuf_pred_fn)(struct keybuf *, struct bkey *); + +struct keybuf { + keybuf_pred_fn *key_predicate; + + struct bkey last_scanned; + spinlock_t lock; + + /* + * Beginning and end of range in rb tree - so that we can skip taking + * lock and checking the rb tree when we need to check for overlapping + * keys. + */ + struct bkey start; + struct bkey end; + + struct rb_root keys; + +#define KEYBUF_NR 100 + DECLARE_ARRAY_ALLOCATOR(struct keybuf_key, freelist, KEYBUF_NR); +}; + +struct bio_split_pool { + struct bio_set *bio_split; + mempool_t *bio_split_hook; +}; + +struct bio_split_hook { + struct closure cl; + struct bio_split_pool *p; + struct bio *bio; + bio_end_io_t *bi_end_io; + void *bi_private; +}; + +struct bcache_device { + struct closure cl; + + struct kobject kobj; + + struct cache_set *c; + unsigned id; +#define BCACHEDEVNAME_SIZE 12 + char name[BCACHEDEVNAME_SIZE]; + + struct gendisk *disk; + + /* If nonzero, we're closing */ + atomic_t closing; + + /* If nonzero, we're detaching/unregistering from cache set */ + atomic_t detaching; + + atomic_long_t sectors_dirty; + unsigned long sectors_dirty_gc; + unsigned long sectors_dirty_last; + long sectors_dirty_derivative; + + mempool_t *unaligned_bvec; + struct bio_set *bio_split; + + unsigned data_csum:1; + + int (*cache_miss)(struct btree *, struct search *, + struct bio *, unsigned); + int (*ioctl) (struct bcache_device *, fmode_t, unsigned, unsigned long); + + struct bio_split_pool bio_split_hook; +}; + +struct io { + /* Used to track sequential IO so it can be skipped */ + struct hlist_node hash; + struct list_head lru; + + unsigned long jiffies; + unsigned sequential; + sector_t last; +}; + +struct cached_dev { + struct list_head list; + struct bcache_device disk; + struct block_device *bdev; + + struct cache_sb sb; + struct bio sb_bio; + struct bio_vec sb_bv[1]; + struct closure_with_waitlist sb_write; + + /* Refcount on the cache set. Always nonzero when we're caching. */ + atomic_t count; + struct work_struct detach; + + /* + * Device might not be running if it's dirty and the cache set hasn't + * showed up yet. + */ + atomic_t running; + + /* + * Writes take a shared lock from start to finish; scanning for dirty + * data to refill the rb tree requires an exclusive lock. + */ + struct rw_semaphore writeback_lock; + + /* + * Nonzero, and writeback has a refcount (d->count), iff there is dirty + * data in the cache. Protected by writeback_lock; must have an + * shared lock to set and exclusive lock to clear. + */ + atomic_t has_dirty; + + struct ratelimit writeback_rate; + struct delayed_work writeback_rate_update; + + /* + * Internal to the writeback code, so read_dirty() can keep track of + * where it's at. + */ + sector_t last_read; + + /* Number of writeback bios in flight */ + atomic_t in_flight; + struct closure_with_timer writeback; + struct closure_waitlist writeback_wait; + + struct keybuf writeback_keys; + + /* For tracking sequential IO */ +#define RECENT_IO_BITS 7 +#define RECENT_IO (1 << RECENT_IO_BITS) + struct io io[RECENT_IO]; + struct hlist_head io_hash[RECENT_IO + 1]; + struct list_head io_lru; + spinlock_t io_lock; + + struct cache_accounting accounting; + + /* The rest of this all shows up in sysfs */ + unsigned sequential_cutoff; + unsigned readahead; + + unsigned sequential_merge:1; + unsigned verify:1; + + unsigned writeback_metadata:1; + unsigned writeback_running:1; + unsigned char writeback_percent; + unsigned writeback_delay; + + int writeback_rate_change; + int64_t writeback_rate_derivative; + uint64_t writeback_rate_target; + + unsigned writeback_rate_update_seconds; + unsigned writeback_rate_d_term; + unsigned writeback_rate_p_term_inverse; + unsigned writeback_rate_d_smooth; +}; + +enum alloc_watermarks { + WATERMARK_PRIO, + WATERMARK_METADATA, + WATERMARK_MOVINGGC, + WATERMARK_NONE, + WATERMARK_MAX +}; + +struct cache { + struct cache_set *set; + struct cache_sb sb; + struct bio sb_bio; + struct bio_vec sb_bv[1]; + + struct kobject kobj; + struct block_device *bdev; + + unsigned watermark[WATERMARK_MAX]; + + struct closure alloc; + struct workqueue_struct *alloc_workqueue; + + struct closure prio; + struct prio_set *disk_buckets; + + /* + * When allocating new buckets, prio_write() gets first dibs - since we + * may not be allocate at all without writing priorities and gens. + * prio_buckets[] contains the last buckets we wrote priorities to (so + * gc can mark them as metadata), prio_next[] contains the buckets + * allocated for the next prio write. + */ + uint64_t *prio_buckets; + uint64_t *prio_last_buckets; + + /* + * free: Buckets that are ready to be used + * + * free_inc: Incoming buckets - these are buckets that currently have + * cached data in them, and we can't reuse them until after we write + * their new gen to disk. After prio_write() finishes writing the new + * gens/prios, they'll be moved to the free list (and possibly discarded + * in the process) + * + * unused: GC found nothing pointing into these buckets (possibly + * because all the data they contained was overwritten), so we only + * need to discard them before they can be moved to the free list. + */ + DECLARE_FIFO(long, free); + DECLARE_FIFO(long, free_inc); + DECLARE_FIFO(long, unused); + + size_t fifo_last_bucket; + + /* Allocation stuff: */ + struct bucket *buckets; + + DECLARE_HEAP(struct bucket *, heap); + + /* + * max(gen - disk_gen) for all buckets. When it gets too big we have to + * call prio_write() to keep gens from wrapping. + */ + uint8_t need_save_prio; + unsigned gc_move_threshold; + + /* + * If nonzero, we know we aren't going to find any buckets to invalidate + * until a gc finishes - otherwise we could pointlessly burn a ton of + * cpu + */ + unsigned invalidate_needs_gc:1; + + bool discard; /* Get rid of? */ + + /* + * We preallocate structs for issuing discards to buckets, and keep them + * on this list when they're not in use; do_discard() issues discards + * whenever there's work to do and is called by free_some_buckets() and + * when a discard finishes. + */ + atomic_t discards_in_flight; + struct list_head discards; + + struct journal_device journal; + + /* The rest of this all shows up in sysfs */ +#define IO_ERROR_SHIFT 20 + atomic_t io_errors; + atomic_t io_count; + + atomic_long_t meta_sectors_written; + atomic_long_t btree_sectors_written; + atomic_long_t sectors_written; + + struct bio_split_pool bio_split_hook; +}; + +struct gc_stat { + size_t nodes; + size_t key_bytes; + + size_t nkeys; + uint64_t data; /* sectors */ + uint64_t dirty; /* sectors */ + unsigned in_use; /* percent */ +}; + +/* + * Flag bits, for how the cache set is shutting down, and what phase it's at: + * + * CACHE_SET_UNREGISTERING means we're not just shutting down, we're detaching + * all the backing devices first (their cached data gets invalidated, and they + * won't automatically reattach). + * + * CACHE_SET_STOPPING always gets set first when we're closing down a cache set; + * we'll continue to run normally for awhile with CACHE_SET_STOPPING set (i.e. + * flushing dirty data). + * + * CACHE_SET_STOPPING_2 gets set at the last phase, when it's time to shut down + * the allocation thread. + */ +#define CACHE_SET_UNREGISTERING 0 +#define CACHE_SET_STOPPING 1 +#define CACHE_SET_STOPPING_2 2 + +struct cache_set { + struct closure cl; + + struct list_head list; + struct kobject kobj; + struct kobject internal; + struct dentry *debug; + struct cache_accounting accounting; + + unsigned long flags; + + struct cache_sb sb; + + struct cache *cache[MAX_CACHES_PER_SET]; + struct cache *cache_by_alloc[MAX_CACHES_PER_SET]; + int caches_loaded; + + struct bcache_device **devices; + struct list_head cached_devs; + uint64_t cached_dev_sectors; + struct closure caching; + + struct closure_with_waitlist sb_write; + + mempool_t *search; + mempool_t *bio_meta; + struct bio_set *bio_split; + + /* For the btree cache */ + struct shrinker shrink; + + /* For the allocator itself */ + wait_queue_head_t alloc_wait; + + /* For the btree cache and anything allocation related */ + struct mutex bucket_lock; + + /* log2(bucket_size), in sectors */ + unsigned short bucket_bits; + + /* log2(block_size), in sectors */ + unsigned short block_bits; + + /* + * Default number of pages for a new btree node - may be less than a + * full bucket + */ + unsigned btree_pages; + + /* + * Lists of struct btrees; lru is the list for structs that have memory + * allocated for actual btree node, freed is for structs that do not. + * + * We never free a struct btree, except on shutdown - we just put it on + * the btree_cache_freed list and reuse it later. This simplifies the + * code, and it doesn't cost us much memory as the memory usage is + * dominated by buffers that hold the actual btree node data and those + * can be freed - and the number of struct btrees allocated is + * effectively bounded. + * + * btree_cache_freeable effectively is a small cache - we use it because + * high order page allocations can be rather expensive, and it's quite + * common to delete and allocate btree nodes in quick succession. It + * should never grow past ~2-3 nodes in practice. + */ + struct list_head btree_cache; + struct list_head btree_cache_freeable; + struct list_head btree_cache_freed; + + /* Number of elements in btree_cache + btree_cache_freeable lists */ + unsigned bucket_cache_used; + + /* + * If we need to allocate memory for a new btree node and that + * allocation fails, we can cannibalize another node in the btree cache + * to satisfy the allocation. However, only one thread can be doing this + * at a time, for obvious reasons - try_harder and try_wait are + * basically a lock for this that we can wait on asynchronously. The + * btree_root() macro releases the lock when it returns. + */ + struct closure *try_harder; + struct closure_waitlist try_wait; + uint64_t try_harder_start; + + /* + * When we free a btree node, we increment the gen of the bucket the + * node is in - but we can't rewrite the prios and gens until we + * finished whatever it is we were doing, otherwise after a crash the + * btree node would be freed but for say a split, we might not have the + * pointers to the new nodes inserted into the btree yet. + * + * This is a refcount that blocks prio_write() until the new keys are + * written. + */ + atomic_t prio_blocked; + struct closure_waitlist bucket_wait; + + /* + * For any bio we don't skip we subtract the number of sectors from + * rescale; when it hits 0 we rescale all the bucket priorities. + */ + atomic_t rescale; + /* + * When we invalidate buckets, we use both the priority and the amount + * of good data to determine which buckets to reuse first - to weight + * those together consistently we keep track of the smallest nonzero + * priority of any bucket. + */ + uint16_t min_prio; + + /* + * max(gen - gc_gen) for all buckets. When it gets too big we have to gc + * to keep gens from wrapping around. + */ + uint8_t need_gc; + struct gc_stat gc_stats; + size_t nbuckets; + + struct closure_with_waitlist gc; + /* Where in the btree gc currently is */ + struct bkey gc_done; + + /* + * The allocation code needs gc_mark in struct bucket to be correct, but + * it's not while a gc is in progress. Protected by bucket_lock. + */ + int gc_mark_valid; + + /* Counts how many sectors bio_insert has added to the cache */ + atomic_t sectors_to_gc; + + struct closure moving_gc; + struct closure_waitlist moving_gc_wait; + struct keybuf moving_gc_keys; + /* Number of moving GC bios in flight */ + atomic_t in_flight; + + struct btree *root; + +#ifdef CONFIG_BCACHE_DEBUG + struct btree *verify_data; + struct mutex verify_lock; +#endif + + unsigned nr_uuids; + struct uuid_entry *uuids; + BKEY_PADDED(uuid_bucket); + struct closure_with_waitlist uuid_write; + + /* + * A btree node on disk could have too many bsets for an iterator to fit + * on the stack - this is a single element mempool for btree_read_work() + */ + struct mutex fill_lock; + struct btree_iter *fill_iter; + + /* + * btree_sort() is a merge sort and requires temporary space - single + * element mempool + */ + struct mutex sort_lock; + struct bset *sort; + + /* List of buckets we're currently writing data to */ + struct list_head data_buckets; + spinlock_t data_bucket_lock; + + struct journal journal; + +#define CONGESTED_MAX 1024 + unsigned congested_last_us; + atomic_t congested; + + /* The rest of this all shows up in sysfs */ + unsigned congested_read_threshold_us; + unsigned congested_write_threshold_us; + + spinlock_t sort_time_lock; + struct time_stats sort_time; + struct time_stats btree_gc_time; + struct time_stats btree_split_time; + spinlock_t btree_read_time_lock; + struct time_stats btree_read_time; + struct time_stats try_harder_time; + + atomic_long_t cache_read_races; + atomic_long_t writeback_keys_done; + atomic_long_t writeback_keys_failed; + unsigned error_limit; + unsigned error_decay; + unsigned short journal_delay_ms; + unsigned verify:1; + unsigned key_merging_disabled:1; + unsigned gc_always_rewrite:1; + unsigned shrinker_disabled:1; + unsigned copy_gc_enabled:1; + +#define BUCKET_HASH_BITS 12 + struct hlist_head bucket_hash[1 << BUCKET_HASH_BITS]; +}; + +static inline bool key_merging_disabled(struct cache_set *c) +{ +#ifdef CONFIG_BCACHE_DEBUG + return c->key_merging_disabled; +#else + return 0; +#endif +} + +static inline bool SB_IS_BDEV(const struct cache_sb *sb) +{ + return sb->version == BCACHE_SB_VERSION_BDEV + || sb->version == BCACHE_SB_VERSION_BDEV_WITH_OFFSET; +} + +struct bbio { + unsigned submit_time_us; + union { + struct bkey key; + uint64_t _pad[3]; + /* + * We only need pad = 3 here because we only ever carry around a + * single pointer - i.e. the pointer we're doing io to/from. + */ + }; + struct bio bio; +}; + +static inline unsigned local_clock_us(void) +{ + return local_clock() >> 10; +} + +#define MAX_BSETS 4U + +#define BTREE_PRIO USHRT_MAX +#define INITIAL_PRIO 32768 + +#define btree_bytes(c) ((c)->btree_pages * PAGE_SIZE) +#define btree_blocks(b) \ + ((unsigned) (KEY_SIZE(&b->key) >> (b)->c->block_bits)) + +#define btree_default_blocks(c) \ + ((unsigned) ((PAGE_SECTORS * (c)->btree_pages) >> (c)->block_bits)) + +#define bucket_pages(c) ((c)->sb.bucket_size / PAGE_SECTORS) +#define bucket_bytes(c) ((c)->sb.bucket_size << 9) +#define block_bytes(c) ((c)->sb.block_size << 9) + +#define __set_bytes(i, k) (sizeof(*(i)) + (k) * sizeof(uint64_t)) +#define set_bytes(i) __set_bytes(i, i->keys) + +#define __set_blocks(i, k, c) DIV_ROUND_UP(__set_bytes(i, k), block_bytes(c)) +#define set_blocks(i, c) __set_blocks(i, (i)->keys, c) + +#define node(i, j) ((struct bkey *) ((i)->d + (j))) +#define end(i) node(i, (i)->keys) + +#define index(i, b) \ + ((size_t) (((void *) i - (void *) (b)->sets[0].data) / \ + block_bytes(b->c))) + +#define btree_data_space(b) (PAGE_SIZE << (b)->page_order) + +#define prios_per_bucket(c) \ + ((bucket_bytes(c) - sizeof(struct prio_set)) / \ + sizeof(struct bucket_disk)) +#define prio_buckets(c) \ + DIV_ROUND_UP((size_t) (c)->sb.nbuckets, prios_per_bucket(c)) + +#define JSET_MAGIC 0x245235c1a3625032ULL +#define PSET_MAGIC 0x6750e15f87337f91ULL +#define BSET_MAGIC 0x90135c78b99e07f5ULL + +#define jset_magic(c) ((c)->sb.set_magic ^ JSET_MAGIC) +#define pset_magic(c) ((c)->sb.set_magic ^ PSET_MAGIC) +#define bset_magic(c) ((c)->sb.set_magic ^ BSET_MAGIC) + +/* Bkey fields: all units are in sectors */ + +#define KEY_FIELD(name, field, offset, size) \ + BITMASK(name, struct bkey, field, offset, size) + +#define PTR_FIELD(name, offset, size) \ + static inline uint64_t name(const struct bkey *k, unsigned i) \ + { return (k->ptr[i] >> offset) & ~(((uint64_t) ~0) << size); } \ + \ + static inline void SET_##name(struct bkey *k, unsigned i, uint64_t v)\ + { \ + k->ptr[i] &= ~(~((uint64_t) ~0 << size) << offset); \ + k->ptr[i] |= v << offset; \ + } + +KEY_FIELD(KEY_PTRS, high, 60, 3) +KEY_FIELD(HEADER_SIZE, high, 58, 2) +KEY_FIELD(KEY_CSUM, high, 56, 2) +KEY_FIELD(KEY_PINNED, high, 55, 1) +KEY_FIELD(KEY_DIRTY, high, 36, 1) + +KEY_FIELD(KEY_SIZE, high, 20, 16) +KEY_FIELD(KEY_INODE, high, 0, 20) + +/* Next time I change the on disk format, KEY_OFFSET() won't be 64 bits */ + +static inline uint64_t KEY_OFFSET(const struct bkey *k) +{ + return k->low; +} + +static inline void SET_KEY_OFFSET(struct bkey *k, uint64_t v) +{ + k->low = v; +} + +PTR_FIELD(PTR_DEV, 51, 12) +PTR_FIELD(PTR_OFFSET, 8, 43) +PTR_FIELD(PTR_GEN, 0, 8) + +#define PTR_CHECK_DEV ((1 << 12) - 1) + +#define PTR(gen, offset, dev) \ + ((((uint64_t) dev) << 51) | ((uint64_t) offset) << 8 | gen) + +static inline size_t sector_to_bucket(struct cache_set *c, sector_t s) +{ + return s >> c->bucket_bits; +} + +static inline sector_t bucket_to_sector(struct cache_set *c, size_t b) +{ + return ((sector_t) b) << c->bucket_bits; +} + +static inline sector_t bucket_remainder(struct cache_set *c, sector_t s) +{ + return s & (c->sb.bucket_size - 1); +} + +static inline struct cache *PTR_CACHE(struct cache_set *c, + const struct bkey *k, + unsigned ptr) +{ + return c->cache[PTR_DEV(k, ptr)]; +} + +static inline size_t PTR_BUCKET_NR(struct cache_set *c, + const struct bkey *k, + unsigned ptr) +{ + return sector_to_bucket(c, PTR_OFFSET(k, ptr)); +} + +static inline struct bucket *PTR_BUCKET(struct cache_set *c, + const struct bkey *k, + unsigned ptr) +{ + return PTR_CACHE(c, k, ptr)->buckets + PTR_BUCKET_NR(c, k, ptr); +} + +/* Btree key macros */ + +/* + * The high bit being set is a relic from when we used it to do binary + * searches - it told you where a key started. It's not used anymore, + * and can probably be safely dropped. + */ +#define KEY(dev, sector, len) \ +((struct bkey) { \ + .high = (1ULL << 63) | ((uint64_t) (len) << 20) | (dev), \ + .low = (sector) \ +}) + +static inline void bkey_init(struct bkey *k) +{ + *k = KEY(0, 0, 0); +} + +#define KEY_START(k) (KEY_OFFSET(k) - KEY_SIZE(k)) +#define START_KEY(k) KEY(KEY_INODE(k), KEY_START(k), 0) +#define MAX_KEY KEY(~(~0 << 20), ((uint64_t) ~0) >> 1, 0) +#define ZERO_KEY KEY(0, 0, 0) + +/* + * This is used for various on disk data structures - cache_sb, prio_set, bset, + * jset: The checksum is _always_ the first 8 bytes of these structs + */ +#define csum_set(i) \ + bch_crc64(((void *) (i)) + sizeof(uint64_t), \ + ((void *) end(i)) - (((void *) (i)) + sizeof(uint64_t))) + +/* Error handling macros */ + +#define btree_bug(b, ...) \ +do { \ + if (bch_cache_set_error((b)->c, __VA_ARGS__)) \ + dump_stack(); \ +} while (0) + +#define cache_bug(c, ...) \ +do { \ + if (bch_cache_set_error(c, __VA_ARGS__)) \ + dump_stack(); \ +} while (0) + +#define btree_bug_on(cond, b, ...) \ +do { \ + if (cond) \ + btree_bug(b, __VA_ARGS__); \ +} while (0) + +#define cache_bug_on(cond, c, ...) \ +do { \ + if (cond) \ + cache_bug(c, __VA_ARGS__); \ +} while (0) + +#define cache_set_err_on(cond, c, ...) \ +do { \ + if (cond) \ + bch_cache_set_error(c, __VA_ARGS__); \ +} while (0) + +/* Looping macros */ + +#define for_each_cache(ca, cs, iter) \ + for (iter = 0; ca = cs->cache[iter], iter < (cs)->sb.nr_in_set; iter++) + +#define for_each_bucket(b, ca) \ + for (b = (ca)->buckets + (ca)->sb.first_bucket; \ + b < (ca)->buckets + (ca)->sb.nbuckets; b++) + +static inline void __bkey_put(struct cache_set *c, struct bkey *k) +{ + unsigned i; + + for (i = 0; i < KEY_PTRS(k); i++) + atomic_dec_bug(&PTR_BUCKET(c, k, i)->pin); +} + +/* Blktrace macros */ + +#define blktrace_msg(c, fmt, ...) \ +do { \ + struct request_queue *q = bdev_get_queue(c->bdev); \ + if (q) \ + blk_add_trace_msg(q, fmt, ##__VA_ARGS__); \ +} while (0) + +#define blktrace_msg_all(s, fmt, ...) \ +do { \ + struct cache *_c; \ + unsigned i; \ + for_each_cache(_c, (s), i) \ + blktrace_msg(_c, fmt, ##__VA_ARGS__); \ +} while (0) + +static inline void cached_dev_put(struct cached_dev *dc) +{ + if (atomic_dec_and_test(&dc->count)) + schedule_work(&dc->detach); +} + +static inline bool cached_dev_get(struct cached_dev *dc) +{ + if (!atomic_inc_not_zero(&dc->count)) + return false; + + /* Paired with the mb in cached_dev_attach */ + smp_mb__after_atomic_inc(); + return true; +} + +/* + * bucket_gc_gen() returns the difference between the bucket's current gen and + * the oldest gen of any pointer into that bucket in the btree (last_gc). + * + * bucket_disk_gen() returns the difference between the current gen and the gen + * on disk; they're both used to make sure gens don't wrap around. + */ + +static inline uint8_t bucket_gc_gen(struct bucket *b) +{ + return b->gen - b->last_gc; +} + +static inline uint8_t bucket_disk_gen(struct bucket *b) +{ + return b->gen - b->disk_gen; +} + +#define BUCKET_GC_GEN_MAX 96U +#define BUCKET_DISK_GEN_MAX 64U + +#define kobj_attribute_write(n, fn) \ + static struct kobj_attribute ksysfs_##n = __ATTR(n, S_IWUSR, NULL, fn) + +#define kobj_attribute_rw(n, show, store) \ + static struct kobj_attribute ksysfs_##n = \ + __ATTR(n, S_IWUSR|S_IRUSR, show, store) + +/* Forward declarations */ + +void bch_writeback_queue(struct cached_dev *); +void bch_writeback_add(struct cached_dev *, unsigned); + +void bch_count_io_errors(struct cache *, int, const char *); +void bch_bbio_count_io_errors(struct cache_set *, struct bio *, + int, const char *); +void bch_bbio_endio(struct cache_set *, struct bio *, int, const char *); +void bch_bbio_free(struct bio *, struct cache_set *); +struct bio *bch_bbio_alloc(struct cache_set *); + +struct bio *bch_bio_split(struct bio *, int, gfp_t, struct bio_set *); +void bch_generic_make_request(struct bio *, struct bio_split_pool *); +void __bch_submit_bbio(struct bio *, struct cache_set *); +void bch_submit_bbio(struct bio *, struct cache_set *, struct bkey *, unsigned); + +uint8_t bch_inc_gen(struct cache *, struct bucket *); +void bch_rescale_priorities(struct cache_set *, int); +bool bch_bucket_add_unused(struct cache *, struct bucket *); +void bch_allocator_thread(struct closure *); + +long bch_bucket_alloc(struct cache *, unsigned, struct closure *); +void bch_bucket_free(struct cache_set *, struct bkey *); + +int __bch_bucket_alloc_set(struct cache_set *, unsigned, + struct bkey *, int, struct closure *); +int bch_bucket_alloc_set(struct cache_set *, unsigned, + struct bkey *, int, struct closure *); + +__printf(2, 3) +bool bch_cache_set_error(struct cache_set *, const char *, ...); + +void bch_prio_write(struct cache *); +void bch_write_bdev_super(struct cached_dev *, struct closure *); + +extern struct workqueue_struct *bcache_wq, *bch_gc_wq; +extern const char * const bch_cache_modes[]; +extern struct mutex bch_register_lock; +extern struct list_head bch_cache_sets; + +extern struct kobj_type bch_cached_dev_ktype; +extern struct kobj_type bch_flash_dev_ktype; +extern struct kobj_type bch_cache_set_ktype; +extern struct kobj_type bch_cache_set_internal_ktype; +extern struct kobj_type bch_cache_ktype; + +void bch_cached_dev_release(struct kobject *); +void bch_flash_dev_release(struct kobject *); +void bch_cache_set_release(struct kobject *); +void bch_cache_release(struct kobject *); + +int bch_uuid_write(struct cache_set *); +void bcache_write_super(struct cache_set *); + +int bch_flash_dev_create(struct cache_set *c, uint64_t size); + +int bch_cached_dev_attach(struct cached_dev *, struct cache_set *); +void bch_cached_dev_detach(struct cached_dev *); +void bch_cached_dev_run(struct cached_dev *); +void bcache_device_stop(struct bcache_device *); + +void bch_cache_set_unregister(struct cache_set *); +void bch_cache_set_stop(struct cache_set *); + +struct cache_set *bch_cache_set_alloc(struct cache_sb *); +void bch_btree_cache_free(struct cache_set *); +int bch_btree_cache_alloc(struct cache_set *); +void bch_writeback_init_cached_dev(struct cached_dev *); +void bch_moving_init_cache_set(struct cache_set *); + +void bch_cache_allocator_exit(struct cache *ca); +int bch_cache_allocator_init(struct cache *ca); + +void bch_debug_exit(void); +int bch_debug_init(struct kobject *); +void bch_writeback_exit(void); +int bch_writeback_init(void); +void bch_request_exit(void); +int bch_request_init(void); +void bch_btree_exit(void); +int bch_btree_init(void); + +#endif /* _BCACHE_H */ diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c new file mode 100644 index 000000000000..cb4578a327b9 --- /dev/null +++ b/drivers/md/bcache/bset.c @@ -0,0 +1,1192 @@ +/* + * Code for working with individual keys, and sorted sets of keys with in a + * btree node + * + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" + +#include +#include + +/* Keylists */ + +void bch_keylist_copy(struct keylist *dest, struct keylist *src) +{ + *dest = *src; + + if (src->list == src->d) { + size_t n = (uint64_t *) src->top - src->d; + dest->top = (struct bkey *) &dest->d[n]; + dest->list = dest->d; + } +} + +int bch_keylist_realloc(struct keylist *l, int nptrs, struct cache_set *c) +{ + unsigned oldsize = (uint64_t *) l->top - l->list; + unsigned newsize = oldsize + 2 + nptrs; + uint64_t *new; + + /* The journalling code doesn't handle the case where the keys to insert + * is bigger than an empty write: If we just return -ENOMEM here, + * bio_insert() and bio_invalidate() will insert the keys created so far + * and finish the rest when the keylist is empty. + */ + if (newsize * sizeof(uint64_t) > block_bytes(c) - sizeof(struct jset)) + return -ENOMEM; + + newsize = roundup_pow_of_two(newsize); + + if (newsize <= KEYLIST_INLINE || + roundup_pow_of_two(oldsize) == newsize) + return 0; + + new = krealloc(l->list == l->d ? NULL : l->list, + sizeof(uint64_t) * newsize, GFP_NOIO); + + if (!new) + return -ENOMEM; + + if (l->list == l->d) + memcpy(new, l->list, sizeof(uint64_t) * KEYLIST_INLINE); + + l->list = new; + l->top = (struct bkey *) (&l->list[oldsize]); + + return 0; +} + +struct bkey *bch_keylist_pop(struct keylist *l) +{ + struct bkey *k = l->bottom; + + if (k == l->top) + return NULL; + + while (bkey_next(k) != l->top) + k = bkey_next(k); + + return l->top = k; +} + +/* Pointer validation */ + +bool __bch_ptr_invalid(struct cache_set *c, int level, const struct bkey *k) +{ + unsigned i; + + if (level && (!KEY_PTRS(k) || !KEY_SIZE(k) || KEY_DIRTY(k))) + goto bad; + + if (!level && KEY_SIZE(k) > KEY_OFFSET(k)) + goto bad; + + if (!KEY_SIZE(k)) + return true; + + for (i = 0; i < KEY_PTRS(k); i++) + if (ptr_available(c, k, i)) { + struct cache *ca = PTR_CACHE(c, k, i); + size_t bucket = PTR_BUCKET_NR(c, k, i); + size_t r = bucket_remainder(c, PTR_OFFSET(k, i)); + + if (KEY_SIZE(k) + r > c->sb.bucket_size || + bucket < ca->sb.first_bucket || + bucket >= ca->sb.nbuckets) + goto bad; + } + + return false; +bad: + cache_bug(c, "spotted bad key %s: %s", pkey(k), bch_ptr_status(c, k)); + return true; +} + +bool bch_ptr_bad(struct btree *b, const struct bkey *k) +{ + struct bucket *g; + unsigned i, stale; + + if (!bkey_cmp(k, &ZERO_KEY) || + !KEY_PTRS(k) || + bch_ptr_invalid(b, k)) + return true; + + if (KEY_PTRS(k) && PTR_DEV(k, 0) == PTR_CHECK_DEV) + return true; + + for (i = 0; i < KEY_PTRS(k); i++) + if (ptr_available(b->c, k, i)) { + g = PTR_BUCKET(b->c, k, i); + stale = ptr_stale(b->c, k, i); + + btree_bug_on(stale > 96, b, + "key too stale: %i, need_gc %u", + stale, b->c->need_gc); + + btree_bug_on(stale && KEY_DIRTY(k) && KEY_SIZE(k), + b, "stale dirty pointer"); + + if (stale) + return true; + +#ifdef CONFIG_BCACHE_EDEBUG + if (!mutex_trylock(&b->c->bucket_lock)) + continue; + + if (b->level) { + if (KEY_DIRTY(k) || + g->prio != BTREE_PRIO || + (b->c->gc_mark_valid && + GC_MARK(g) != GC_MARK_METADATA)) + goto bug; + + } else { + if (g->prio == BTREE_PRIO) + goto bug; + + if (KEY_DIRTY(k) && + b->c->gc_mark_valid && + GC_MARK(g) != GC_MARK_DIRTY) + goto bug; + } + mutex_unlock(&b->c->bucket_lock); +#endif + } + + return false; +#ifdef CONFIG_BCACHE_EDEBUG +bug: + mutex_unlock(&b->c->bucket_lock); + btree_bug(b, +"inconsistent pointer %s: bucket %zu pin %i prio %i gen %i last_gc %i mark %llu gc_gen %i", + pkey(k), PTR_BUCKET_NR(b->c, k, i), atomic_read(&g->pin), + g->prio, g->gen, g->last_gc, GC_MARK(g), g->gc_gen); + return true; +#endif +} + +/* Key/pointer manipulation */ + +void bch_bkey_copy_single_ptr(struct bkey *dest, const struct bkey *src, + unsigned i) +{ + BUG_ON(i > KEY_PTRS(src)); + + /* Only copy the header, key, and one pointer. */ + memcpy(dest, src, 2 * sizeof(uint64_t)); + dest->ptr[0] = src->ptr[i]; + SET_KEY_PTRS(dest, 1); + /* We didn't copy the checksum so clear that bit. */ + SET_KEY_CSUM(dest, 0); +} + +bool __bch_cut_front(const struct bkey *where, struct bkey *k) +{ + unsigned i, len = 0; + + if (bkey_cmp(where, &START_KEY(k)) <= 0) + return false; + + if (bkey_cmp(where, k) < 0) + len = KEY_OFFSET(k) - KEY_OFFSET(where); + else + bkey_copy_key(k, where); + + for (i = 0; i < KEY_PTRS(k); i++) + SET_PTR_OFFSET(k, i, PTR_OFFSET(k, i) + KEY_SIZE(k) - len); + + BUG_ON(len > KEY_SIZE(k)); + SET_KEY_SIZE(k, len); + return true; +} + +bool __bch_cut_back(const struct bkey *where, struct bkey *k) +{ + unsigned len = 0; + + if (bkey_cmp(where, k) >= 0) + return false; + + BUG_ON(KEY_INODE(where) != KEY_INODE(k)); + + if (bkey_cmp(where, &START_KEY(k)) > 0) + len = KEY_OFFSET(where) - KEY_START(k); + + bkey_copy_key(k, where); + + BUG_ON(len > KEY_SIZE(k)); + SET_KEY_SIZE(k, len); + return true; +} + +static uint64_t merge_chksums(struct bkey *l, struct bkey *r) +{ + return (l->ptr[KEY_PTRS(l)] + r->ptr[KEY_PTRS(r)]) & + ~((uint64_t)1 << 63); +} + +/* Tries to merge l and r: l should be lower than r + * Returns true if we were able to merge. If we did merge, l will be the merged + * key, r will be untouched. + */ +bool bch_bkey_try_merge(struct btree *b, struct bkey *l, struct bkey *r) +{ + unsigned i; + + if (key_merging_disabled(b->c)) + return false; + + if (KEY_PTRS(l) != KEY_PTRS(r) || + KEY_DIRTY(l) != KEY_DIRTY(r) || + bkey_cmp(l, &START_KEY(r))) + return false; + + for (i = 0; i < KEY_PTRS(l); i++) + if (l->ptr[i] + PTR(0, KEY_SIZE(l), 0) != r->ptr[i] || + PTR_BUCKET_NR(b->c, l, i) != PTR_BUCKET_NR(b->c, r, i)) + return false; + + /* Keys with no pointers aren't restricted to one bucket and could + * overflow KEY_SIZE + */ + if (KEY_SIZE(l) + KEY_SIZE(r) > USHRT_MAX) { + SET_KEY_OFFSET(l, KEY_OFFSET(l) + USHRT_MAX - KEY_SIZE(l)); + SET_KEY_SIZE(l, USHRT_MAX); + + bch_cut_front(l, r); + return false; + } + + if (KEY_CSUM(l)) { + if (KEY_CSUM(r)) + l->ptr[KEY_PTRS(l)] = merge_chksums(l, r); + else + SET_KEY_CSUM(l, 0); + } + + SET_KEY_OFFSET(l, KEY_OFFSET(l) + KEY_SIZE(r)); + SET_KEY_SIZE(l, KEY_SIZE(l) + KEY_SIZE(r)); + + return true; +} + +/* Binary tree stuff for auxiliary search trees */ + +static unsigned inorder_next(unsigned j, unsigned size) +{ + if (j * 2 + 1 < size) { + j = j * 2 + 1; + + while (j * 2 < size) + j *= 2; + } else + j >>= ffz(j) + 1; + + return j; +} + +static unsigned inorder_prev(unsigned j, unsigned size) +{ + if (j * 2 < size) { + j = j * 2; + + while (j * 2 + 1 < size) + j = j * 2 + 1; + } else + j >>= ffs(j); + + return j; +} + +/* I have no idea why this code works... and I'm the one who wrote it + * + * However, I do know what it does: + * Given a binary tree constructed in an array (i.e. how you normally implement + * a heap), it converts a node in the tree - referenced by array index - to the + * index it would have if you did an inorder traversal. + * + * Also tested for every j, size up to size somewhere around 6 million. + * + * The binary tree starts at array index 1, not 0 + * extra is a function of size: + * extra = (size - rounddown_pow_of_two(size - 1)) << 1; + */ +static unsigned __to_inorder(unsigned j, unsigned size, unsigned extra) +{ + unsigned b = fls(j); + unsigned shift = fls(size - 1) - b; + + j ^= 1U << (b - 1); + j <<= 1; + j |= 1; + j <<= shift; + + if (j > extra) + j -= (j - extra) >> 1; + + return j; +} + +static unsigned to_inorder(unsigned j, struct bset_tree *t) +{ + return __to_inorder(j, t->size, t->extra); +} + +static unsigned __inorder_to_tree(unsigned j, unsigned size, unsigned extra) +{ + unsigned shift; + + if (j > extra) + j += j - extra; + + shift = ffs(j); + + j >>= shift; + j |= roundup_pow_of_two(size) >> shift; + + return j; +} + +static unsigned inorder_to_tree(unsigned j, struct bset_tree *t) +{ + return __inorder_to_tree(j, t->size, t->extra); +} + +#if 0 +void inorder_test(void) +{ + unsigned long done = 0; + ktime_t start = ktime_get(); + + for (unsigned size = 2; + size < 65536000; + size++) { + unsigned extra = (size - rounddown_pow_of_two(size - 1)) << 1; + unsigned i = 1, j = rounddown_pow_of_two(size - 1); + + if (!(size % 4096)) + printk(KERN_NOTICE "loop %u, %llu per us\n", size, + done / ktime_us_delta(ktime_get(), start)); + + while (1) { + if (__inorder_to_tree(i, size, extra) != j) + panic("size %10u j %10u i %10u", size, j, i); + + if (__to_inorder(j, size, extra) != i) + panic("size %10u j %10u i %10u", size, j, i); + + if (j == rounddown_pow_of_two(size) - 1) + break; + + BUG_ON(inorder_prev(inorder_next(j, size), size) != j); + + j = inorder_next(j, size); + i++; + } + + done += size - 1; + } +} +#endif + +/* + * Cacheline/offset <-> bkey pointer arithmatic: + * + * t->tree is a binary search tree in an array; each node corresponds to a key + * in one cacheline in t->set (BSET_CACHELINE bytes). + * + * This means we don't have to store the full index of the key that a node in + * the binary tree points to; to_inorder() gives us the cacheline, and then + * bkey_float->m gives us the offset within that cacheline, in units of 8 bytes. + * + * cacheline_to_bkey() and friends abstract out all the pointer arithmatic to + * make this work. + * + * To construct the bfloat for an arbitrary key we need to know what the key + * immediately preceding it is: we have to check if the two keys differ in the + * bits we're going to store in bkey_float->mantissa. t->prev[j] stores the size + * of the previous key so we can walk backwards to it from t->tree[j]'s key. + */ + +static struct bkey *cacheline_to_bkey(struct bset_tree *t, unsigned cacheline, + unsigned offset) +{ + return ((void *) t->data) + cacheline * BSET_CACHELINE + offset * 8; +} + +static unsigned bkey_to_cacheline(struct bset_tree *t, struct bkey *k) +{ + return ((void *) k - (void *) t->data) / BSET_CACHELINE; +} + +static unsigned bkey_to_cacheline_offset(struct bkey *k) +{ + return ((size_t) k & (BSET_CACHELINE - 1)) / sizeof(uint64_t); +} + +static struct bkey *tree_to_bkey(struct bset_tree *t, unsigned j) +{ + return cacheline_to_bkey(t, to_inorder(j, t), t->tree[j].m); +} + +static struct bkey *tree_to_prev_bkey(struct bset_tree *t, unsigned j) +{ + return (void *) (((uint64_t *) tree_to_bkey(t, j)) - t->prev[j]); +} + +/* + * For the write set - the one we're currently inserting keys into - we don't + * maintain a full search tree, we just keep a simple lookup table in t->prev. + */ +static struct bkey *table_to_bkey(struct bset_tree *t, unsigned cacheline) +{ + return cacheline_to_bkey(t, cacheline, t->prev[cacheline]); +} + +static inline uint64_t shrd128(uint64_t high, uint64_t low, uint8_t shift) +{ +#ifdef CONFIG_X86_64 + asm("shrd %[shift],%[high],%[low]" + : [low] "+Rm" (low) + : [high] "R" (high), + [shift] "ci" (shift) + : "cc"); +#else + low >>= shift; + low |= (high << 1) << (63U - shift); +#endif + return low; +} + +static inline unsigned bfloat_mantissa(const struct bkey *k, + struct bkey_float *f) +{ + const uint64_t *p = &k->low - (f->exponent >> 6); + return shrd128(p[-1], p[0], f->exponent & 63) & BKEY_MANTISSA_MASK; +} + +static void make_bfloat(struct bset_tree *t, unsigned j) +{ + struct bkey_float *f = &t->tree[j]; + struct bkey *m = tree_to_bkey(t, j); + struct bkey *p = tree_to_prev_bkey(t, j); + + struct bkey *l = is_power_of_2(j) + ? t->data->start + : tree_to_prev_bkey(t, j >> ffs(j)); + + struct bkey *r = is_power_of_2(j + 1) + ? node(t->data, t->data->keys - bkey_u64s(&t->end)) + : tree_to_bkey(t, j >> (ffz(j) + 1)); + + BUG_ON(m < l || m > r); + BUG_ON(bkey_next(p) != m); + + if (KEY_INODE(l) != KEY_INODE(r)) + f->exponent = fls64(KEY_INODE(r) ^ KEY_INODE(l)) + 64; + else + f->exponent = fls64(r->low ^ l->low); + + f->exponent = max_t(int, f->exponent - BKEY_MANTISSA_BITS, 0); + + /* + * Setting f->exponent = 127 flags this node as failed, and causes the + * lookup code to fall back to comparing against the original key. + */ + + if (bfloat_mantissa(m, f) != bfloat_mantissa(p, f)) + f->mantissa = bfloat_mantissa(m, f) - 1; + else + f->exponent = 127; +} + +static void bset_alloc_tree(struct btree *b, struct bset_tree *t) +{ + if (t != b->sets) { + unsigned j = roundup(t[-1].size, + 64 / sizeof(struct bkey_float)); + + t->tree = t[-1].tree + j; + t->prev = t[-1].prev + j; + } + + while (t < b->sets + MAX_BSETS) + t++->size = 0; +} + +static void bset_build_unwritten_tree(struct btree *b) +{ + struct bset_tree *t = b->sets + b->nsets; + + bset_alloc_tree(b, t); + + if (t->tree != b->sets->tree + bset_tree_space(b)) { + t->prev[0] = bkey_to_cacheline_offset(t->data->start); + t->size = 1; + } +} + +static void bset_build_written_tree(struct btree *b) +{ + struct bset_tree *t = b->sets + b->nsets; + struct bkey *k = t->data->start; + unsigned j, cacheline = 1; + + bset_alloc_tree(b, t); + + t->size = min_t(unsigned, + bkey_to_cacheline(t, end(t->data)), + b->sets->tree + bset_tree_space(b) - t->tree); + + if (t->size < 2) { + t->size = 0; + return; + } + + t->extra = (t->size - rounddown_pow_of_two(t->size - 1)) << 1; + + /* First we figure out where the first key in each cacheline is */ + for (j = inorder_next(0, t->size); + j; + j = inorder_next(j, t->size)) { + while (bkey_to_cacheline(t, k) != cacheline) + k = bkey_next(k); + + t->prev[j] = bkey_u64s(k); + k = bkey_next(k); + cacheline++; + t->tree[j].m = bkey_to_cacheline_offset(k); + } + + while (bkey_next(k) != end(t->data)) + k = bkey_next(k); + + t->end = *k; + + /* Then we build the tree */ + for (j = inorder_next(0, t->size); + j; + j = inorder_next(j, t->size)) + make_bfloat(t, j); +} + +void bch_bset_fix_invalidated_key(struct btree *b, struct bkey *k) +{ + struct bset_tree *t; + unsigned inorder, j = 1; + + for (t = b->sets; t <= &b->sets[b->nsets]; t++) + if (k < end(t->data)) + goto found_set; + + BUG(); +found_set: + if (!t->size || !bset_written(b, t)) + return; + + inorder = bkey_to_cacheline(t, k); + + if (k == t->data->start) + goto fix_left; + + if (bkey_next(k) == end(t->data)) { + t->end = *k; + goto fix_right; + } + + j = inorder_to_tree(inorder, t); + + if (j && + j < t->size && + k == tree_to_bkey(t, j)) +fix_left: do { + make_bfloat(t, j); + j = j * 2; + } while (j < t->size); + + j = inorder_to_tree(inorder + 1, t); + + if (j && + j < t->size && + k == tree_to_prev_bkey(t, j)) +fix_right: do { + make_bfloat(t, j); + j = j * 2 + 1; + } while (j < t->size); +} + +void bch_bset_fix_lookup_table(struct btree *b, struct bkey *k) +{ + struct bset_tree *t = &b->sets[b->nsets]; + unsigned shift = bkey_u64s(k); + unsigned j = bkey_to_cacheline(t, k); + + /* We're getting called from btree_split() or btree_gc, just bail out */ + if (!t->size) + return; + + /* k is the key we just inserted; we need to find the entry in the + * lookup table for the first key that is strictly greater than k: + * it's either k's cacheline or the next one + */ + if (j < t->size && + table_to_bkey(t, j) <= k) + j++; + + /* Adjust all the lookup table entries, and find a new key for any that + * have gotten too big + */ + for (; j < t->size; j++) { + t->prev[j] += shift; + + if (t->prev[j] > 7) { + k = table_to_bkey(t, j - 1); + + while (k < cacheline_to_bkey(t, j, 0)) + k = bkey_next(k); + + t->prev[j] = bkey_to_cacheline_offset(k); + } + } + + if (t->size == b->sets->tree + bset_tree_space(b) - t->tree) + return; + + /* Possibly add a new entry to the end of the lookup table */ + + for (k = table_to_bkey(t, t->size - 1); + k != end(t->data); + k = bkey_next(k)) + if (t->size == bkey_to_cacheline(t, k)) { + t->prev[t->size] = bkey_to_cacheline_offset(k); + t->size++; + } +} + +void bch_bset_init_next(struct btree *b) +{ + struct bset *i = write_block(b); + + if (i != b->sets[0].data) { + b->sets[++b->nsets].data = i; + i->seq = b->sets[0].data->seq; + } else + get_random_bytes(&i->seq, sizeof(uint64_t)); + + i->magic = bset_magic(b->c); + i->version = 0; + i->keys = 0; + + bset_build_unwritten_tree(b); +} + +struct bset_search_iter { + struct bkey *l, *r; +}; + +static struct bset_search_iter bset_search_write_set(struct btree *b, + struct bset_tree *t, + const struct bkey *search) +{ + unsigned li = 0, ri = t->size; + + BUG_ON(!b->nsets && + t->size < bkey_to_cacheline(t, end(t->data))); + + while (li + 1 != ri) { + unsigned m = (li + ri) >> 1; + + if (bkey_cmp(table_to_bkey(t, m), search) > 0) + ri = m; + else + li = m; + } + + return (struct bset_search_iter) { + table_to_bkey(t, li), + ri < t->size ? table_to_bkey(t, ri) : end(t->data) + }; +} + +static struct bset_search_iter bset_search_tree(struct btree *b, + struct bset_tree *t, + const struct bkey *search) +{ + struct bkey *l, *r; + struct bkey_float *f; + unsigned inorder, j, n = 1; + + do { + unsigned p = n << 4; + p &= ((int) (p - t->size)) >> 31; + + prefetch(&t->tree[p]); + + j = n; + f = &t->tree[j]; + + /* + * n = (f->mantissa > bfloat_mantissa()) + * ? j * 2 + * : j * 2 + 1; + * + * We need to subtract 1 from f->mantissa for the sign bit trick + * to work - that's done in make_bfloat() + */ + if (likely(f->exponent != 127)) + n = j * 2 + (((unsigned) + (f->mantissa - + bfloat_mantissa(search, f))) >> 31); + else + n = (bkey_cmp(tree_to_bkey(t, j), search) > 0) + ? j * 2 + : j * 2 + 1; + } while (n < t->size); + + inorder = to_inorder(j, t); + + /* + * n would have been the node we recursed to - the low bit tells us if + * we recursed left or recursed right. + */ + if (n & 1) { + l = cacheline_to_bkey(t, inorder, f->m); + + if (++inorder != t->size) { + f = &t->tree[inorder_next(j, t->size)]; + r = cacheline_to_bkey(t, inorder, f->m); + } else + r = end(t->data); + } else { + r = cacheline_to_bkey(t, inorder, f->m); + + if (--inorder) { + f = &t->tree[inorder_prev(j, t->size)]; + l = cacheline_to_bkey(t, inorder, f->m); + } else + l = t->data->start; + } + + return (struct bset_search_iter) {l, r}; +} + +struct bkey *__bch_bset_search(struct btree *b, struct bset_tree *t, + const struct bkey *search) +{ + struct bset_search_iter i; + + /* + * First, we search for a cacheline, then lastly we do a linear search + * within that cacheline. + * + * To search for the cacheline, there's three different possibilities: + * * The set is too small to have a search tree, so we just do a linear + * search over the whole set. + * * The set is the one we're currently inserting into; keeping a full + * auxiliary search tree up to date would be too expensive, so we + * use a much simpler lookup table to do a binary search - + * bset_search_write_set(). + * * Or we use the auxiliary search tree we constructed earlier - + * bset_search_tree() + */ + + if (unlikely(!t->size)) { + i.l = t->data->start; + i.r = end(t->data); + } else if (bset_written(b, t)) { + /* + * Each node in the auxiliary search tree covers a certain range + * of bits, and keys above and below the set it covers might + * differ outside those bits - so we have to special case the + * start and end - handle that here: + */ + + if (unlikely(bkey_cmp(search, &t->end) >= 0)) + return end(t->data); + + if (unlikely(bkey_cmp(search, t->data->start) < 0)) + return t->data->start; + + i = bset_search_tree(b, t, search); + } else + i = bset_search_write_set(b, t, search); + +#ifdef CONFIG_BCACHE_EDEBUG + BUG_ON(bset_written(b, t) && + i.l != t->data->start && + bkey_cmp(tree_to_prev_bkey(t, + inorder_to_tree(bkey_to_cacheline(t, i.l), t)), + search) > 0); + + BUG_ON(i.r != end(t->data) && + bkey_cmp(i.r, search) <= 0); +#endif + + while (likely(i.l != i.r) && + bkey_cmp(i.l, search) <= 0) + i.l = bkey_next(i.l); + + return i.l; +} + +/* Btree iterator */ + +static inline bool btree_iter_cmp(struct btree_iter_set l, + struct btree_iter_set r) +{ + int64_t c = bkey_cmp(&START_KEY(l.k), &START_KEY(r.k)); + + return c ? c > 0 : l.k < r.k; +} + +static inline bool btree_iter_end(struct btree_iter *iter) +{ + return !iter->used; +} + +void bch_btree_iter_push(struct btree_iter *iter, struct bkey *k, + struct bkey *end) +{ + if (k != end) + BUG_ON(!heap_add(iter, + ((struct btree_iter_set) { k, end }), + btree_iter_cmp)); +} + +struct bkey *__bch_btree_iter_init(struct btree *b, struct btree_iter *iter, + struct bkey *search, struct bset_tree *start) +{ + struct bkey *ret = NULL; + iter->size = ARRAY_SIZE(iter->data); + iter->used = 0; + + for (; start <= &b->sets[b->nsets]; start++) { + ret = bch_bset_search(b, start, search); + bch_btree_iter_push(iter, ret, end(start->data)); + } + + return ret; +} + +struct bkey *bch_btree_iter_next(struct btree_iter *iter) +{ + struct btree_iter_set unused; + struct bkey *ret = NULL; + + if (!btree_iter_end(iter)) { + ret = iter->data->k; + iter->data->k = bkey_next(iter->data->k); + + if (iter->data->k > iter->data->end) { + WARN_ONCE(1, "bset was corrupt!\n"); + iter->data->k = iter->data->end; + } + + if (iter->data->k == iter->data->end) + heap_pop(iter, unused, btree_iter_cmp); + else + heap_sift(iter, 0, btree_iter_cmp); + } + + return ret; +} + +struct bkey *bch_btree_iter_next_filter(struct btree_iter *iter, + struct btree *b, ptr_filter_fn fn) +{ + struct bkey *ret; + + do { + ret = bch_btree_iter_next(iter); + } while (ret && fn(b, ret)); + + return ret; +} + +struct bkey *bch_next_recurse_key(struct btree *b, struct bkey *search) +{ + struct btree_iter iter; + + bch_btree_iter_init(b, &iter, search); + return bch_btree_iter_next_filter(&iter, b, bch_ptr_bad); +} + +/* Mergesort */ + +static void btree_sort_fixup(struct btree_iter *iter) +{ + while (iter->used > 1) { + struct btree_iter_set *top = iter->data, *i = top + 1; + struct bkey *k; + + if (iter->used > 2 && + btree_iter_cmp(i[0], i[1])) + i++; + + for (k = i->k; + k != i->end && bkey_cmp(top->k, &START_KEY(k)) > 0; + k = bkey_next(k)) + if (top->k > i->k) + __bch_cut_front(top->k, k); + else if (KEY_SIZE(k)) + bch_cut_back(&START_KEY(k), top->k); + + if (top->k < i->k || k == i->k) + break; + + heap_sift(iter, i - top, btree_iter_cmp); + } +} + +static void btree_mergesort(struct btree *b, struct bset *out, + struct btree_iter *iter, + bool fixup, bool remove_stale) +{ + struct bkey *k, *last = NULL; + bool (*bad)(struct btree *, const struct bkey *) = remove_stale + ? bch_ptr_bad + : bch_ptr_invalid; + + while (!btree_iter_end(iter)) { + if (fixup && !b->level) + btree_sort_fixup(iter); + + k = bch_btree_iter_next(iter); + if (bad(b, k)) + continue; + + if (!last) { + last = out->start; + bkey_copy(last, k); + } else if (b->level || + !bch_bkey_try_merge(b, last, k)) { + last = bkey_next(last); + bkey_copy(last, k); + } + } + + out->keys = last ? (uint64_t *) bkey_next(last) - out->d : 0; + + pr_debug("sorted %i keys", out->keys); + bch_check_key_order(b, out); +} + +static void __btree_sort(struct btree *b, struct btree_iter *iter, + unsigned start, unsigned order, bool fixup) +{ + uint64_t start_time; + bool remove_stale = !b->written; + struct bset *out = (void *) __get_free_pages(__GFP_NOWARN|GFP_NOIO, + order); + if (!out) { + mutex_lock(&b->c->sort_lock); + out = b->c->sort; + order = ilog2(bucket_pages(b->c)); + } + + start_time = local_clock(); + + btree_mergesort(b, out, iter, fixup, remove_stale); + b->nsets = start; + + if (!fixup && !start && b->written) + bch_btree_verify(b, out); + + if (!start && order == b->page_order) { + /* + * Our temporary buffer is the same size as the btree node's + * buffer, we can just swap buffers instead of doing a big + * memcpy() + */ + + out->magic = bset_magic(b->c); + out->seq = b->sets[0].data->seq; + out->version = b->sets[0].data->version; + swap(out, b->sets[0].data); + + if (b->c->sort == b->sets[0].data) + b->c->sort = out; + } else { + b->sets[start].data->keys = out->keys; + memcpy(b->sets[start].data->start, out->start, + (void *) end(out) - (void *) out->start); + } + + if (out == b->c->sort) + mutex_unlock(&b->c->sort_lock); + else + free_pages((unsigned long) out, order); + + if (b->written) + bset_build_written_tree(b); + + if (!start) { + spin_lock(&b->c->sort_time_lock); + bch_time_stats_update(&b->c->sort_time, start_time); + spin_unlock(&b->c->sort_time_lock); + } +} + +void bch_btree_sort_partial(struct btree *b, unsigned start) +{ + size_t oldsize = 0, order = b->page_order, keys = 0; + struct btree_iter iter; + __bch_btree_iter_init(b, &iter, NULL, &b->sets[start]); + + BUG_ON(b->sets[b->nsets].data == write_block(b) && + (b->sets[b->nsets].size || b->nsets)); + + if (b->written) + oldsize = bch_count_data(b); + + if (start) { + unsigned i; + + for (i = start; i <= b->nsets; i++) + keys += b->sets[i].data->keys; + + order = roundup_pow_of_two(__set_bytes(b->sets->data, + keys)) / PAGE_SIZE; + if (order) + order = ilog2(order); + } + + __btree_sort(b, &iter, start, order, false); + + EBUG_ON(b->written && bch_count_data(b) != oldsize); +} + +void bch_btree_sort_and_fix_extents(struct btree *b, struct btree_iter *iter) +{ + BUG_ON(!b->written); + __btree_sort(b, iter, 0, b->page_order, true); +} + +void bch_btree_sort_into(struct btree *b, struct btree *new) +{ + uint64_t start_time = local_clock(); + + struct btree_iter iter; + bch_btree_iter_init(b, &iter, NULL); + + btree_mergesort(b, new->sets->data, &iter, false, true); + + spin_lock(&b->c->sort_time_lock); + bch_time_stats_update(&b->c->sort_time, start_time); + spin_unlock(&b->c->sort_time_lock); + + bkey_copy_key(&new->key, &b->key); + new->sets->size = 0; +} + +void bch_btree_sort_lazy(struct btree *b) +{ + if (b->nsets) { + unsigned i, j, keys = 0, total; + + for (i = 0; i <= b->nsets; i++) + keys += b->sets[i].data->keys; + + total = keys; + + for (j = 0; j < b->nsets; j++) { + if (keys * 2 < total || + keys < 1000) { + bch_btree_sort_partial(b, j); + return; + } + + keys -= b->sets[j].data->keys; + } + + /* Must sort if b->nsets == 3 or we'll overflow */ + if (b->nsets >= (MAX_BSETS - 1) - b->level) { + bch_btree_sort(b); + return; + } + } + + bset_build_written_tree(b); +} + +/* Sysfs stuff */ + +struct bset_stats { + size_t nodes; + size_t sets_written, sets_unwritten; + size_t bytes_written, bytes_unwritten; + size_t floats, failed; +}; + +static int bch_btree_bset_stats(struct btree *b, struct btree_op *op, + struct bset_stats *stats) +{ + struct bkey *k; + unsigned i; + + stats->nodes++; + + for (i = 0; i <= b->nsets; i++) { + struct bset_tree *t = &b->sets[i]; + size_t bytes = t->data->keys * sizeof(uint64_t); + size_t j; + + if (bset_written(b, t)) { + stats->sets_written++; + stats->bytes_written += bytes; + + stats->floats += t->size - 1; + + for (j = 1; j < t->size; j++) + if (t->tree[j].exponent == 127) + stats->failed++; + } else { + stats->sets_unwritten++; + stats->bytes_unwritten += bytes; + } + } + + if (b->level) { + struct btree_iter iter; + + for_each_key_filter(b, k, &iter, bch_ptr_bad) { + int ret = btree(bset_stats, k, b, op, stats); + if (ret) + return ret; + } + } + + return 0; +} + +int bch_bset_print_stats(struct cache_set *c, char *buf) +{ + struct btree_op op; + struct bset_stats t; + int ret; + + bch_btree_op_init_stack(&op); + memset(&t, 0, sizeof(struct bset_stats)); + + ret = btree_root(bset_stats, c, &op, &t); + if (ret) + return ret; + + return snprintf(buf, PAGE_SIZE, + "btree nodes: %zu\n" + "written sets: %zu\n" + "unwritten sets: %zu\n" + "written key bytes: %zu\n" + "unwritten key bytes: %zu\n" + "floats: %zu\n" + "failed: %zu\n", + t.nodes, + t.sets_written, t.sets_unwritten, + t.bytes_written, t.bytes_unwritten, + t.floats, t.failed); +} diff --git a/drivers/md/bcache/bset.h b/drivers/md/bcache/bset.h new file mode 100644 index 000000000000..57a9cff41546 --- /dev/null +++ b/drivers/md/bcache/bset.h @@ -0,0 +1,379 @@ +#ifndef _BCACHE_BSET_H +#define _BCACHE_BSET_H + +/* + * BKEYS: + * + * A bkey contains a key, a size field, a variable number of pointers, and some + * ancillary flag bits. + * + * We use two different functions for validating bkeys, bch_ptr_invalid and + * bch_ptr_bad(). + * + * bch_ptr_invalid() primarily filters out keys and pointers that would be + * invalid due to some sort of bug, whereas bch_ptr_bad() filters out keys and + * pointer that occur in normal practice but don't point to real data. + * + * The one exception to the rule that ptr_invalid() filters out invalid keys is + * that it also filters out keys of size 0 - these are keys that have been + * completely overwritten. It'd be safe to delete these in memory while leaving + * them on disk, just unnecessary work - so we filter them out when resorting + * instead. + * + * We can't filter out stale keys when we're resorting, because garbage + * collection needs to find them to ensure bucket gens don't wrap around - + * unless we're rewriting the btree node those stale keys still exist on disk. + * + * We also implement functions here for removing some number of sectors from the + * front or the back of a bkey - this is mainly used for fixing overlapping + * extents, by removing the overlapping sectors from the older key. + * + * BSETS: + * + * A bset is an array of bkeys laid out contiguously in memory in sorted order, + * along with a header. A btree node is made up of a number of these, written at + * different times. + * + * There could be many of them on disk, but we never allow there to be more than + * 4 in memory - we lazily resort as needed. + * + * We implement code here for creating and maintaining auxiliary search trees + * (described below) for searching an individial bset, and on top of that we + * implement a btree iterator. + * + * BTREE ITERATOR: + * + * Most of the code in bcache doesn't care about an individual bset - it needs + * to search entire btree nodes and iterate over them in sorted order. + * + * The btree iterator code serves both functions; it iterates through the keys + * in a btree node in sorted order, starting from either keys after a specific + * point (if you pass it a search key) or the start of the btree node. + * + * AUXILIARY SEARCH TREES: + * + * Since keys are variable length, we can't use a binary search on a bset - we + * wouldn't be able to find the start of the next key. But binary searches are + * slow anyways, due to terrible cache behaviour; bcache originally used binary + * searches and that code topped out at under 50k lookups/second. + * + * So we need to construct some sort of lookup table. Since we only insert keys + * into the last (unwritten) set, most of the keys within a given btree node are + * usually in sets that are mostly constant. We use two different types of + * lookup tables to take advantage of this. + * + * Both lookup tables share in common that they don't index every key in the + * set; they index one key every BSET_CACHELINE bytes, and then a linear search + * is used for the rest. + * + * For sets that have been written to disk and are no longer being inserted + * into, we construct a binary search tree in an array - traversing a binary + * search tree in an array gives excellent locality of reference and is very + * fast, since both children of any node are adjacent to each other in memory + * (and their grandchildren, and great grandchildren...) - this means + * prefetching can be used to great effect. + * + * It's quite useful performance wise to keep these nodes small - not just + * because they're more likely to be in L2, but also because we can prefetch + * more nodes on a single cacheline and thus prefetch more iterations in advance + * when traversing this tree. + * + * Nodes in the auxiliary search tree must contain both a key to compare against + * (we don't want to fetch the key from the set, that would defeat the purpose), + * and a pointer to the key. We use a few tricks to compress both of these. + * + * To compress the pointer, we take advantage of the fact that one node in the + * search tree corresponds to precisely BSET_CACHELINE bytes in the set. We have + * a function (to_inorder()) that takes the index of a node in a binary tree and + * returns what its index would be in an inorder traversal, so we only have to + * store the low bits of the offset. + * + * The key is 84 bits (KEY_DEV + key->key, the offset on the device). To + * compress that, we take advantage of the fact that when we're traversing the + * search tree at every iteration we know that both our search key and the key + * we're looking for lie within some range - bounded by our previous + * comparisons. (We special case the start of a search so that this is true even + * at the root of the tree). + * + * So we know the key we're looking for is between a and b, and a and b don't + * differ higher than bit 50, we don't need to check anything higher than bit + * 50. + * + * We don't usually need the rest of the bits, either; we only need enough bits + * to partition the key range we're currently checking. Consider key n - the + * key our auxiliary search tree node corresponds to, and key p, the key + * immediately preceding n. The lowest bit we need to store in the auxiliary + * search tree is the highest bit that differs between n and p. + * + * Note that this could be bit 0 - we might sometimes need all 80 bits to do the + * comparison. But we'd really like our nodes in the auxiliary search tree to be + * of fixed size. + * + * The solution is to make them fixed size, and when we're constructing a node + * check if p and n differed in the bits we needed them to. If they don't we + * flag that node, and when doing lookups we fallback to comparing against the + * real key. As long as this doesn't happen to often (and it seems to reliably + * happen a bit less than 1% of the time), we win - even on failures, that key + * is then more likely to be in cache than if we were doing binary searches all + * the way, since we're touching so much less memory. + * + * The keys in the auxiliary search tree are stored in (software) floating + * point, with an exponent and a mantissa. The exponent needs to be big enough + * to address all the bits in the original key, but the number of bits in the + * mantissa is somewhat arbitrary; more bits just gets us fewer failures. + * + * We need 7 bits for the exponent and 3 bits for the key's offset (since keys + * are 8 byte aligned); using 22 bits for the mantissa means a node is 4 bytes. + * We need one node per 128 bytes in the btree node, which means the auxiliary + * search trees take up 3% as much memory as the btree itself. + * + * Constructing these auxiliary search trees is moderately expensive, and we + * don't want to be constantly rebuilding the search tree for the last set + * whenever we insert another key into it. For the unwritten set, we use a much + * simpler lookup table - it's just a flat array, so index i in the lookup table + * corresponds to the i range of BSET_CACHELINE bytes in the set. Indexing + * within each byte range works the same as with the auxiliary search trees. + * + * These are much easier to keep up to date when we insert a key - we do it + * somewhat lazily; when we shift a key up we usually just increment the pointer + * to it, only when it would overflow do we go to the trouble of finding the + * first key in that range of bytes again. + */ + +/* Btree key comparison/iteration */ + +struct btree_iter { + size_t size, used; + struct btree_iter_set { + struct bkey *k, *end; + } data[MAX_BSETS]; +}; + +struct bset_tree { + /* + * We construct a binary tree in an array as if the array + * started at 1, so that things line up on the same cachelines + * better: see comments in bset.c at cacheline_to_bkey() for + * details + */ + + /* size of the binary tree and prev array */ + unsigned size; + + /* function of size - precalculated for to_inorder() */ + unsigned extra; + + /* copy of the last key in the set */ + struct bkey end; + struct bkey_float *tree; + + /* + * The nodes in the bset tree point to specific keys - this + * array holds the sizes of the previous key. + * + * Conceptually it's a member of struct bkey_float, but we want + * to keep bkey_float to 4 bytes and prev isn't used in the fast + * path. + */ + uint8_t *prev; + + /* The actual btree node, with pointers to each sorted set */ + struct bset *data; +}; + +static __always_inline int64_t bkey_cmp(const struct bkey *l, + const struct bkey *r) +{ + return unlikely(KEY_INODE(l) != KEY_INODE(r)) + ? (int64_t) KEY_INODE(l) - (int64_t) KEY_INODE(r) + : (int64_t) KEY_OFFSET(l) - (int64_t) KEY_OFFSET(r); +} + +static inline size_t bkey_u64s(const struct bkey *k) +{ + BUG_ON(KEY_CSUM(k) > 1); + return 2 + KEY_PTRS(k) + (KEY_CSUM(k) ? 1 : 0); +} + +static inline size_t bkey_bytes(const struct bkey *k) +{ + return bkey_u64s(k) * sizeof(uint64_t); +} + +static inline void bkey_copy(struct bkey *dest, const struct bkey *src) +{ + memcpy(dest, src, bkey_bytes(src)); +} + +static inline void bkey_copy_key(struct bkey *dest, const struct bkey *src) +{ + if (!src) + src = &KEY(0, 0, 0); + + SET_KEY_INODE(dest, KEY_INODE(src)); + SET_KEY_OFFSET(dest, KEY_OFFSET(src)); +} + +static inline struct bkey *bkey_next(const struct bkey *k) +{ + uint64_t *d = (void *) k; + return (struct bkey *) (d + bkey_u64s(k)); +} + +/* Keylists */ + +struct keylist { + struct bkey *top; + union { + uint64_t *list; + struct bkey *bottom; + }; + + /* Enough room for btree_split's keys without realloc */ +#define KEYLIST_INLINE 16 + uint64_t d[KEYLIST_INLINE]; +}; + +static inline void bch_keylist_init(struct keylist *l) +{ + l->top = (void *) (l->list = l->d); +} + +static inline void bch_keylist_push(struct keylist *l) +{ + l->top = bkey_next(l->top); +} + +static inline void bch_keylist_add(struct keylist *l, struct bkey *k) +{ + bkey_copy(l->top, k); + bch_keylist_push(l); +} + +static inline bool bch_keylist_empty(struct keylist *l) +{ + return l->top == (void *) l->list; +} + +static inline void bch_keylist_free(struct keylist *l) +{ + if (l->list != l->d) + kfree(l->list); +} + +void bch_keylist_copy(struct keylist *, struct keylist *); +struct bkey *bch_keylist_pop(struct keylist *); +int bch_keylist_realloc(struct keylist *, int, struct cache_set *); + +void bch_bkey_copy_single_ptr(struct bkey *, const struct bkey *, + unsigned); +bool __bch_cut_front(const struct bkey *, struct bkey *); +bool __bch_cut_back(const struct bkey *, struct bkey *); + +static inline bool bch_cut_front(const struct bkey *where, struct bkey *k) +{ + BUG_ON(bkey_cmp(where, k) > 0); + return __bch_cut_front(where, k); +} + +static inline bool bch_cut_back(const struct bkey *where, struct bkey *k) +{ + BUG_ON(bkey_cmp(where, &START_KEY(k)) < 0); + return __bch_cut_back(where, k); +} + +const char *bch_ptr_status(struct cache_set *, const struct bkey *); +bool __bch_ptr_invalid(struct cache_set *, int level, const struct bkey *); +bool bch_ptr_bad(struct btree *, const struct bkey *); + +static inline uint8_t gen_after(uint8_t a, uint8_t b) +{ + uint8_t r = a - b; + return r > 128U ? 0 : r; +} + +static inline uint8_t ptr_stale(struct cache_set *c, const struct bkey *k, + unsigned i) +{ + return gen_after(PTR_BUCKET(c, k, i)->gen, PTR_GEN(k, i)); +} + +static inline bool ptr_available(struct cache_set *c, const struct bkey *k, + unsigned i) +{ + return (PTR_DEV(k, i) < MAX_CACHES_PER_SET) && PTR_CACHE(c, k, i); +} + + +typedef bool (*ptr_filter_fn)(struct btree *, const struct bkey *); + +struct bkey *bch_next_recurse_key(struct btree *, struct bkey *); +struct bkey *bch_btree_iter_next(struct btree_iter *); +struct bkey *bch_btree_iter_next_filter(struct btree_iter *, + struct btree *, ptr_filter_fn); + +void bch_btree_iter_push(struct btree_iter *, struct bkey *, struct bkey *); +struct bkey *__bch_btree_iter_init(struct btree *, struct btree_iter *, + struct bkey *, struct bset_tree *); + +/* 32 bits total: */ +#define BKEY_MID_BITS 3 +#define BKEY_EXPONENT_BITS 7 +#define BKEY_MANTISSA_BITS 22 +#define BKEY_MANTISSA_MASK ((1 << BKEY_MANTISSA_BITS) - 1) + +struct bkey_float { + unsigned exponent:BKEY_EXPONENT_BITS; + unsigned m:BKEY_MID_BITS; + unsigned mantissa:BKEY_MANTISSA_BITS; +} __packed; + +/* + * BSET_CACHELINE was originally intended to match the hardware cacheline size - + * it used to be 64, but I realized the lookup code would touch slightly less + * memory if it was 128. + * + * It definites the number of bytes (in struct bset) per struct bkey_float in + * the auxiliar search tree - when we're done searching the bset_float tree we + * have this many bytes left that we do a linear search over. + * + * Since (after level 5) every level of the bset_tree is on a new cacheline, + * we're touching one fewer cacheline in the bset tree in exchange for one more + * cacheline in the linear search - but the linear search might stop before it + * gets to the second cacheline. + */ + +#define BSET_CACHELINE 128 +#define bset_tree_space(b) (btree_data_space(b) / BSET_CACHELINE) + +#define bset_tree_bytes(b) (bset_tree_space(b) * sizeof(struct bkey_float)) +#define bset_prev_bytes(b) (bset_tree_space(b) * sizeof(uint8_t)) + +void bch_bset_init_next(struct btree *); + +void bch_bset_fix_invalidated_key(struct btree *, struct bkey *); +void bch_bset_fix_lookup_table(struct btree *, struct bkey *); + +struct bkey *__bch_bset_search(struct btree *, struct bset_tree *, + const struct bkey *); + +static inline struct bkey *bch_bset_search(struct btree *b, struct bset_tree *t, + const struct bkey *search) +{ + return search ? __bch_bset_search(b, t, search) : t->data->start; +} + +bool bch_bkey_try_merge(struct btree *, struct bkey *, struct bkey *); +void bch_btree_sort_lazy(struct btree *); +void bch_btree_sort_into(struct btree *, struct btree *); +void bch_btree_sort_and_fix_extents(struct btree *, struct btree_iter *); +void bch_btree_sort_partial(struct btree *, unsigned); + +static inline void bch_btree_sort(struct btree *b) +{ + bch_btree_sort_partial(b, 0); +} + +int bch_bset_print_stats(struct cache_set *, char *); + +#endif diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c new file mode 100644 index 000000000000..7a5658f04e62 --- /dev/null +++ b/drivers/md/bcache/btree.c @@ -0,0 +1,2503 @@ +/* + * Copyright (C) 2010 Kent Overstreet + * + * Uses a block device as cache for other block devices; optimized for SSDs. + * All allocation is done in buckets, which should match the erase block size + * of the device. + * + * Buckets containing cached data are kept on a heap sorted by priority; + * bucket priority is increased on cache hit, and periodically all the buckets + * on the heap have their priority scaled down. This currently is just used as + * an LRU but in the future should allow for more intelligent heuristics. + * + * Buckets have an 8 bit counter; freeing is accomplished by incrementing the + * counter. Garbage collection is used to remove stale pointers. + * + * Indexing is done via a btree; nodes are not necessarily fully sorted, rather + * as keys are inserted we only sort the pages that have not yet been written. + * When garbage collection is run, we resort the entire node. + * + * All configuration is done via sysfs; see Documentation/bcache.txt. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +#include +#include +#include +#include +#include +#include +#include + +/* + * Todo: + * register_bcache: Return errors out to userspace correctly + * + * Writeback: don't undirty key until after a cache flush + * + * Create an iterator for key pointers + * + * On btree write error, mark bucket such that it won't be freed from the cache + * + * Journalling: + * Check for bad keys in replay + * Propagate barriers + * Refcount journal entries in journal_replay + * + * Garbage collection: + * Finish incremental gc + * Gc should free old UUIDs, data for invalid UUIDs + * + * Provide a way to list backing device UUIDs we have data cached for, and + * probably how long it's been since we've seen them, and a way to invalidate + * dirty data for devices that will never be attached again + * + * Keep 1 min/5 min/15 min statistics of how busy a block device has been, so + * that based on that and how much dirty data we have we can keep writeback + * from being starved + * + * Add a tracepoint or somesuch to watch for writeback starvation + * + * When btree depth > 1 and splitting an interior node, we have to make sure + * alloc_bucket() cannot fail. This should be true but is not completely + * obvious. + * + * Make sure all allocations get charged to the root cgroup + * + * Plugging? + * + * If data write is less than hard sector size of ssd, round up offset in open + * bucket to the next whole sector + * + * Also lookup by cgroup in get_open_bucket() + * + * Superblock needs to be fleshed out for multiple cache devices + * + * Add a sysfs tunable for the number of writeback IOs in flight + * + * Add a sysfs tunable for the number of open data buckets + * + * IO tracking: Can we track when one process is doing io on behalf of another? + * IO tracking: Don't use just an average, weigh more recent stuff higher + * + * Test module load/unload + */ + +static const char * const op_types[] = { + "insert", "replace" +}; + +static const char *op_type(struct btree_op *op) +{ + return op_types[op->type]; +} + +#define MAX_NEED_GC 64 +#define MAX_SAVE_PRIO 72 + +#define PTR_DIRTY_BIT (((uint64_t) 1 << 36)) + +#define PTR_HASH(c, k) \ + (((k)->ptr[0] >> c->bucket_bits) | PTR_GEN(k, 0)) + +struct workqueue_struct *bch_gc_wq; +static struct workqueue_struct *btree_io_wq; + +void bch_btree_op_init_stack(struct btree_op *op) +{ + memset(op, 0, sizeof(struct btree_op)); + closure_init_stack(&op->cl); + op->lock = -1; + bch_keylist_init(&op->keys); +} + +/* Btree key manipulation */ + +static void bkey_put(struct cache_set *c, struct bkey *k, int level) +{ + if ((level && KEY_OFFSET(k)) || !level) + __bkey_put(c, k); +} + +/* Btree IO */ + +static uint64_t btree_csum_set(struct btree *b, struct bset *i) +{ + uint64_t crc = b->key.ptr[0]; + void *data = (void *) i + 8, *end = end(i); + + crc = bch_crc64_update(crc, data, end - data); + return crc ^ 0xffffffffffffffffULL; +} + +static void btree_bio_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + struct btree *b = container_of(cl, struct btree, io.cl); + + if (error) + set_btree_node_io_error(b); + + bch_bbio_count_io_errors(b->c, bio, error, (bio->bi_rw & WRITE) + ? "writing btree" : "reading btree"); + closure_put(cl); +} + +static void btree_bio_init(struct btree *b) +{ + BUG_ON(b->bio); + b->bio = bch_bbio_alloc(b->c); + + b->bio->bi_end_io = btree_bio_endio; + b->bio->bi_private = &b->io.cl; +} + +void bch_btree_read_done(struct closure *cl) +{ + struct btree *b = container_of(cl, struct btree, io.cl); + struct bset *i = b->sets[0].data; + struct btree_iter *iter = b->c->fill_iter; + const char *err = "bad btree header"; + BUG_ON(b->nsets || b->written); + + bch_bbio_free(b->bio, b->c); + b->bio = NULL; + + mutex_lock(&b->c->fill_lock); + iter->used = 0; + + if (btree_node_io_error(b) || + !i->seq) + goto err; + + for (; + b->written < btree_blocks(b) && i->seq == b->sets[0].data->seq; + i = write_block(b)) { + err = "unsupported bset version"; + if (i->version > BCACHE_BSET_VERSION) + goto err; + + err = "bad btree header"; + if (b->written + set_blocks(i, b->c) > btree_blocks(b)) + goto err; + + err = "bad magic"; + if (i->magic != bset_magic(b->c)) + goto err; + + err = "bad checksum"; + switch (i->version) { + case 0: + if (i->csum != csum_set(i)) + goto err; + break; + case BCACHE_BSET_VERSION: + if (i->csum != btree_csum_set(b, i)) + goto err; + break; + } + + err = "empty set"; + if (i != b->sets[0].data && !i->keys) + goto err; + + bch_btree_iter_push(iter, i->start, end(i)); + + b->written += set_blocks(i, b->c); + } + + err = "corrupted btree"; + for (i = write_block(b); + index(i, b) < btree_blocks(b); + i = ((void *) i) + block_bytes(b->c)) + if (i->seq == b->sets[0].data->seq) + goto err; + + bch_btree_sort_and_fix_extents(b, iter); + + i = b->sets[0].data; + err = "short btree key"; + if (b->sets[0].size && + bkey_cmp(&b->key, &b->sets[0].end) < 0) + goto err; + + if (b->written < btree_blocks(b)) + bch_bset_init_next(b); +out: + + mutex_unlock(&b->c->fill_lock); + + spin_lock(&b->c->btree_read_time_lock); + bch_time_stats_update(&b->c->btree_read_time, b->io_start_time); + spin_unlock(&b->c->btree_read_time_lock); + + smp_wmb(); /* read_done is our write lock */ + set_btree_node_read_done(b); + + closure_return(cl); +err: + set_btree_node_io_error(b); + bch_cache_set_error(b->c, "%s at bucket %zu, block %zu, %u keys", + err, PTR_BUCKET_NR(b->c, &b->key, 0), + index(i, b), i->keys); + goto out; +} + +void bch_btree_read(struct btree *b) +{ + BUG_ON(b->nsets || b->written); + + if (!closure_trylock(&b->io.cl, &b->c->cl)) + BUG(); + + b->io_start_time = local_clock(); + + btree_bio_init(b); + b->bio->bi_rw = REQ_META|READ_SYNC; + b->bio->bi_size = KEY_SIZE(&b->key) << 9; + + bch_bio_map(b->bio, b->sets[0].data); + + pr_debug("%s", pbtree(b)); + trace_bcache_btree_read(b->bio); + bch_submit_bbio(b->bio, b->c, &b->key, 0); + + continue_at(&b->io.cl, bch_btree_read_done, system_wq); +} + +static void btree_complete_write(struct btree *b, struct btree_write *w) +{ + if (w->prio_blocked && + !atomic_sub_return(w->prio_blocked, &b->c->prio_blocked)) + wake_up(&b->c->alloc_wait); + + if (w->journal) { + atomic_dec_bug(w->journal); + __closure_wake_up(&b->c->journal.wait); + } + + if (w->owner) + closure_put(w->owner); + + w->prio_blocked = 0; + w->journal = NULL; + w->owner = NULL; +} + +static void __btree_write_done(struct closure *cl) +{ + struct btree *b = container_of(cl, struct btree, io.cl); + struct btree_write *w = btree_prev_write(b); + + bch_bbio_free(b->bio, b->c); + b->bio = NULL; + btree_complete_write(b, w); + + if (btree_node_dirty(b)) + queue_delayed_work(btree_io_wq, &b->work, + msecs_to_jiffies(30000)); + + closure_return(cl); +} + +static void btree_write_done(struct closure *cl) +{ + struct btree *b = container_of(cl, struct btree, io.cl); + struct bio_vec *bv; + int n; + + __bio_for_each_segment(bv, b->bio, n, 0) + __free_page(bv->bv_page); + + __btree_write_done(cl); +} + +static void do_btree_write(struct btree *b) +{ + struct closure *cl = &b->io.cl; + struct bset *i = b->sets[b->nsets].data; + BKEY_PADDED(key) k; + + i->version = BCACHE_BSET_VERSION; + i->csum = btree_csum_set(b, i); + + btree_bio_init(b); + b->bio->bi_rw = REQ_META|WRITE_SYNC; + b->bio->bi_size = set_blocks(i, b->c) * block_bytes(b->c); + bch_bio_map(b->bio, i); + + bkey_copy(&k.key, &b->key); + SET_PTR_OFFSET(&k.key, 0, PTR_OFFSET(&k.key, 0) + bset_offset(b, i)); + + if (!bch_bio_alloc_pages(b->bio, GFP_NOIO)) { + int j; + struct bio_vec *bv; + void *base = (void *) ((unsigned long) i & ~(PAGE_SIZE - 1)); + + bio_for_each_segment(bv, b->bio, j) + memcpy(page_address(bv->bv_page), + base + j * PAGE_SIZE, PAGE_SIZE); + + trace_bcache_btree_write(b->bio); + bch_submit_bbio(b->bio, b->c, &k.key, 0); + + continue_at(cl, btree_write_done, NULL); + } else { + b->bio->bi_vcnt = 0; + bch_bio_map(b->bio, i); + + trace_bcache_btree_write(b->bio); + bch_submit_bbio(b->bio, b->c, &k.key, 0); + + closure_sync(cl); + __btree_write_done(cl); + } +} + +static void __btree_write(struct btree *b) +{ + struct bset *i = b->sets[b->nsets].data; + + BUG_ON(current->bio_list); + + closure_lock(&b->io, &b->c->cl); + cancel_delayed_work(&b->work); + + clear_bit(BTREE_NODE_dirty, &b->flags); + change_bit(BTREE_NODE_write_idx, &b->flags); + + bch_check_key_order(b, i); + BUG_ON(b->written && !i->keys); + + do_btree_write(b); + + pr_debug("%s block %i keys %i", pbtree(b), b->written, i->keys); + + b->written += set_blocks(i, b->c); + atomic_long_add(set_blocks(i, b->c) * b->c->sb.block_size, + &PTR_CACHE(b->c, &b->key, 0)->btree_sectors_written); + + bch_btree_sort_lazy(b); + + if (b->written < btree_blocks(b)) + bch_bset_init_next(b); +} + +static void btree_write_work(struct work_struct *w) +{ + struct btree *b = container_of(to_delayed_work(w), struct btree, work); + + down_write(&b->lock); + + if (btree_node_dirty(b)) + __btree_write(b); + up_write(&b->lock); +} + +void bch_btree_write(struct btree *b, bool now, struct btree_op *op) +{ + struct bset *i = b->sets[b->nsets].data; + struct btree_write *w = btree_current_write(b); + + BUG_ON(b->written && + (b->written >= btree_blocks(b) || + i->seq != b->sets[0].data->seq || + !i->keys)); + + if (!btree_node_dirty(b)) { + set_btree_node_dirty(b); + queue_delayed_work(btree_io_wq, &b->work, + msecs_to_jiffies(30000)); + } + + w->prio_blocked += b->prio_blocked; + b->prio_blocked = 0; + + if (op && op->journal && !b->level) { + if (w->journal && + journal_pin_cmp(b->c, w, op)) { + atomic_dec_bug(w->journal); + w->journal = NULL; + } + + if (!w->journal) { + w->journal = op->journal; + atomic_inc(w->journal); + } + } + + if (current->bio_list) + return; + + /* Force write if set is too big */ + if (now || + b->level || + set_bytes(i) > PAGE_SIZE - 48) { + if (op && now) { + /* Must wait on multiple writes */ + BUG_ON(w->owner); + w->owner = &op->cl; + closure_get(&op->cl); + } + + __btree_write(b); + } + BUG_ON(!b->written); +} + +/* + * Btree in memory cache - allocation/freeing + * mca -> memory cache + */ + +static void mca_reinit(struct btree *b) +{ + unsigned i; + + b->flags = 0; + b->written = 0; + b->nsets = 0; + + for (i = 0; i < MAX_BSETS; i++) + b->sets[i].size = 0; + /* + * Second loop starts at 1 because b->sets[0]->data is the memory we + * allocated + */ + for (i = 1; i < MAX_BSETS; i++) + b->sets[i].data = NULL; +} + +#define mca_reserve(c) (((c->root && c->root->level) \ + ? c->root->level : 1) * 8 + 16) +#define mca_can_free(c) \ + max_t(int, 0, c->bucket_cache_used - mca_reserve(c)) + +static void mca_data_free(struct btree *b) +{ + struct bset_tree *t = b->sets; + BUG_ON(!closure_is_unlocked(&b->io.cl)); + + if (bset_prev_bytes(b) < PAGE_SIZE) + kfree(t->prev); + else + free_pages((unsigned long) t->prev, + get_order(bset_prev_bytes(b))); + + if (bset_tree_bytes(b) < PAGE_SIZE) + kfree(t->tree); + else + free_pages((unsigned long) t->tree, + get_order(bset_tree_bytes(b))); + + free_pages((unsigned long) t->data, b->page_order); + + t->prev = NULL; + t->tree = NULL; + t->data = NULL; + list_move(&b->list, &b->c->btree_cache_freed); + b->c->bucket_cache_used--; +} + +static void mca_bucket_free(struct btree *b) +{ + BUG_ON(btree_node_dirty(b)); + + b->key.ptr[0] = 0; + hlist_del_init_rcu(&b->hash); + list_move(&b->list, &b->c->btree_cache_freeable); +} + +static unsigned btree_order(struct bkey *k) +{ + return ilog2(KEY_SIZE(k) / PAGE_SECTORS ?: 1); +} + +static void mca_data_alloc(struct btree *b, struct bkey *k, gfp_t gfp) +{ + struct bset_tree *t = b->sets; + BUG_ON(t->data); + + b->page_order = max_t(unsigned, + ilog2(b->c->btree_pages), + btree_order(k)); + + t->data = (void *) __get_free_pages(gfp, b->page_order); + if (!t->data) + goto err; + + t->tree = bset_tree_bytes(b) < PAGE_SIZE + ? kmalloc(bset_tree_bytes(b), gfp) + : (void *) __get_free_pages(gfp, get_order(bset_tree_bytes(b))); + if (!t->tree) + goto err; + + t->prev = bset_prev_bytes(b) < PAGE_SIZE + ? kmalloc(bset_prev_bytes(b), gfp) + : (void *) __get_free_pages(gfp, get_order(bset_prev_bytes(b))); + if (!t->prev) + goto err; + + list_move(&b->list, &b->c->btree_cache); + b->c->bucket_cache_used++; + return; +err: + mca_data_free(b); +} + +static struct btree *mca_bucket_alloc(struct cache_set *c, + struct bkey *k, gfp_t gfp) +{ + struct btree *b = kzalloc(sizeof(struct btree), gfp); + if (!b) + return NULL; + + init_rwsem(&b->lock); + lockdep_set_novalidate_class(&b->lock); + INIT_LIST_HEAD(&b->list); + INIT_DELAYED_WORK(&b->work, btree_write_work); + b->c = c; + closure_init_unlocked(&b->io); + + mca_data_alloc(b, k, gfp); + return b; +} + +static int mca_reap(struct btree *b, struct closure *cl, unsigned min_order) +{ + lockdep_assert_held(&b->c->bucket_lock); + + if (!down_write_trylock(&b->lock)) + return -ENOMEM; + + if (b->page_order < min_order) { + rw_unlock(true, b); + return -ENOMEM; + } + + BUG_ON(btree_node_dirty(b) && !b->sets[0].data); + + if (cl && btree_node_dirty(b)) + bch_btree_write(b, true, NULL); + + if (cl) + closure_wait_event_async(&b->io.wait, cl, + atomic_read(&b->io.cl.remaining) == -1); + + if (btree_node_dirty(b) || + !closure_is_unlocked(&b->io.cl) || + work_pending(&b->work.work)) { + rw_unlock(true, b); + return -EAGAIN; + } + + return 0; +} + +static int bch_mca_shrink(struct shrinker *shrink, struct shrink_control *sc) +{ + struct cache_set *c = container_of(shrink, struct cache_set, shrink); + struct btree *b, *t; + unsigned long i, nr = sc->nr_to_scan; + + if (c->shrinker_disabled) + return 0; + + if (c->try_harder) + return 0; + + /* + * If nr == 0, we're supposed to return the number of items we have + * cached. Not allowed to return -1. + */ + if (!nr) + return mca_can_free(c) * c->btree_pages; + + /* Return -1 if we can't do anything right now */ + if (sc->gfp_mask & __GFP_WAIT) + mutex_lock(&c->bucket_lock); + else if (!mutex_trylock(&c->bucket_lock)) + return -1; + + nr /= c->btree_pages; + nr = min_t(unsigned long, nr, mca_can_free(c)); + + i = 0; + list_for_each_entry_safe(b, t, &c->btree_cache_freeable, list) { + if (!nr) + break; + + if (++i > 3 && + !mca_reap(b, NULL, 0)) { + mca_data_free(b); + rw_unlock(true, b); + --nr; + } + } + + /* + * Can happen right when we first start up, before we've read in any + * btree nodes + */ + if (list_empty(&c->btree_cache)) + goto out; + + for (i = 0; nr && i < c->bucket_cache_used; i++) { + b = list_first_entry(&c->btree_cache, struct btree, list); + list_rotate_left(&c->btree_cache); + + if (!b->accessed && + !mca_reap(b, NULL, 0)) { + mca_bucket_free(b); + mca_data_free(b); + rw_unlock(true, b); + --nr; + } else + b->accessed = 0; + } +out: + nr = mca_can_free(c) * c->btree_pages; + mutex_unlock(&c->bucket_lock); + return nr; +} + +void bch_btree_cache_free(struct cache_set *c) +{ + struct btree *b; + struct closure cl; + closure_init_stack(&cl); + + if (c->shrink.list.next) + unregister_shrinker(&c->shrink); + + mutex_lock(&c->bucket_lock); + +#ifdef CONFIG_BCACHE_DEBUG + if (c->verify_data) + list_move(&c->verify_data->list, &c->btree_cache); +#endif + + list_splice(&c->btree_cache_freeable, + &c->btree_cache); + + while (!list_empty(&c->btree_cache)) { + b = list_first_entry(&c->btree_cache, struct btree, list); + + if (btree_node_dirty(b)) + btree_complete_write(b, btree_current_write(b)); + clear_bit(BTREE_NODE_dirty, &b->flags); + + mca_data_free(b); + } + + while (!list_empty(&c->btree_cache_freed)) { + b = list_first_entry(&c->btree_cache_freed, + struct btree, list); + list_del(&b->list); + cancel_delayed_work_sync(&b->work); + kfree(b); + } + + mutex_unlock(&c->bucket_lock); +} + +int bch_btree_cache_alloc(struct cache_set *c) +{ + unsigned i; + + /* XXX: doesn't check for errors */ + + closure_init_unlocked(&c->gc); + + for (i = 0; i < mca_reserve(c); i++) + mca_bucket_alloc(c, &ZERO_KEY, GFP_KERNEL); + + list_splice_init(&c->btree_cache, + &c->btree_cache_freeable); + +#ifdef CONFIG_BCACHE_DEBUG + mutex_init(&c->verify_lock); + + c->verify_data = mca_bucket_alloc(c, &ZERO_KEY, GFP_KERNEL); + + if (c->verify_data && + c->verify_data->sets[0].data) + list_del_init(&c->verify_data->list); + else + c->verify_data = NULL; +#endif + + c->shrink.shrink = bch_mca_shrink; + c->shrink.seeks = 4; + c->shrink.batch = c->btree_pages * 2; + register_shrinker(&c->shrink); + + return 0; +} + +/* Btree in memory cache - hash table */ + +static struct hlist_head *mca_hash(struct cache_set *c, struct bkey *k) +{ + return &c->bucket_hash[hash_32(PTR_HASH(c, k), BUCKET_HASH_BITS)]; +} + +static struct btree *mca_find(struct cache_set *c, struct bkey *k) +{ + struct btree *b; + + rcu_read_lock(); + hlist_for_each_entry_rcu(b, mca_hash(c, k), hash) + if (PTR_HASH(c, &b->key) == PTR_HASH(c, k)) + goto out; + b = NULL; +out: + rcu_read_unlock(); + return b; +} + +static struct btree *mca_cannibalize(struct cache_set *c, struct bkey *k, + int level, struct closure *cl) +{ + int ret = -ENOMEM; + struct btree *i; + + if (!cl) + return ERR_PTR(-ENOMEM); + + /* + * Trying to free up some memory - i.e. reuse some btree nodes - may + * require initiating IO to flush the dirty part of the node. If we're + * running under generic_make_request(), that IO will never finish and + * we would deadlock. Returning -EAGAIN causes the cache lookup code to + * punt to workqueue and retry. + */ + if (current->bio_list) + return ERR_PTR(-EAGAIN); + + if (c->try_harder && c->try_harder != cl) { + closure_wait_event_async(&c->try_wait, cl, !c->try_harder); + return ERR_PTR(-EAGAIN); + } + + /* XXX: tracepoint */ + c->try_harder = cl; + c->try_harder_start = local_clock(); +retry: + list_for_each_entry_reverse(i, &c->btree_cache, list) { + int r = mca_reap(i, cl, btree_order(k)); + if (!r) + return i; + if (r != -ENOMEM) + ret = r; + } + + if (ret == -EAGAIN && + closure_blocking(cl)) { + mutex_unlock(&c->bucket_lock); + closure_sync(cl); + mutex_lock(&c->bucket_lock); + goto retry; + } + + return ERR_PTR(ret); +} + +/* + * We can only have one thread cannibalizing other cached btree nodes at a time, + * or we'll deadlock. We use an open coded mutex to ensure that, which a + * cannibalize_bucket() will take. This means every time we unlock the root of + * the btree, we need to release this lock if we have it held. + */ +void bch_cannibalize_unlock(struct cache_set *c, struct closure *cl) +{ + if (c->try_harder == cl) { + bch_time_stats_update(&c->try_harder_time, c->try_harder_start); + c->try_harder = NULL; + __closure_wake_up(&c->try_wait); + } +} + +static struct btree *mca_alloc(struct cache_set *c, struct bkey *k, + int level, struct closure *cl) +{ + struct btree *b; + + lockdep_assert_held(&c->bucket_lock); + + if (mca_find(c, k)) + return NULL; + + /* btree_free() doesn't free memory; it sticks the node on the end of + * the list. Check if there's any freed nodes there: + */ + list_for_each_entry(b, &c->btree_cache_freeable, list) + if (!mca_reap(b, NULL, btree_order(k))) + goto out; + + /* We never free struct btree itself, just the memory that holds the on + * disk node. Check the freed list before allocating a new one: + */ + list_for_each_entry(b, &c->btree_cache_freed, list) + if (!mca_reap(b, NULL, 0)) { + mca_data_alloc(b, k, __GFP_NOWARN|GFP_NOIO); + if (!b->sets[0].data) + goto err; + else + goto out; + } + + b = mca_bucket_alloc(c, k, __GFP_NOWARN|GFP_NOIO); + if (!b) + goto err; + + BUG_ON(!down_write_trylock(&b->lock)); + if (!b->sets->data) + goto err; +out: + BUG_ON(!closure_is_unlocked(&b->io.cl)); + + bkey_copy(&b->key, k); + list_move(&b->list, &c->btree_cache); + hlist_del_init_rcu(&b->hash); + hlist_add_head_rcu(&b->hash, mca_hash(c, k)); + + lock_set_subclass(&b->lock.dep_map, level + 1, _THIS_IP_); + b->level = level; + + mca_reinit(b); + + return b; +err: + if (b) + rw_unlock(true, b); + + b = mca_cannibalize(c, k, level, cl); + if (!IS_ERR(b)) + goto out; + + return b; +} + +/** + * bch_btree_node_get - find a btree node in the cache and lock it, reading it + * in from disk if necessary. + * + * If IO is necessary, it uses the closure embedded in struct btree_op to wait; + * if that closure is in non blocking mode, will return -EAGAIN. + * + * The btree node will have either a read or a write lock held, depending on + * level and op->lock. + */ +struct btree *bch_btree_node_get(struct cache_set *c, struct bkey *k, + int level, struct btree_op *op) +{ + int i = 0; + bool write = level <= op->lock; + struct btree *b; + + BUG_ON(level < 0); +retry: + b = mca_find(c, k); + + if (!b) { + mutex_lock(&c->bucket_lock); + b = mca_alloc(c, k, level, &op->cl); + mutex_unlock(&c->bucket_lock); + + if (!b) + goto retry; + if (IS_ERR(b)) + return b; + + bch_btree_read(b); + + if (!write) + downgrade_write(&b->lock); + } else { + rw_lock(write, b, level); + if (PTR_HASH(c, &b->key) != PTR_HASH(c, k)) { + rw_unlock(write, b); + goto retry; + } + BUG_ON(b->level != level); + } + + b->accessed = 1; + + for (; i <= b->nsets && b->sets[i].size; i++) { + prefetch(b->sets[i].tree); + prefetch(b->sets[i].data); + } + + for (; i <= b->nsets; i++) + prefetch(b->sets[i].data); + + if (!closure_wait_event(&b->io.wait, &op->cl, + btree_node_read_done(b))) { + rw_unlock(write, b); + b = ERR_PTR(-EAGAIN); + } else if (btree_node_io_error(b)) { + rw_unlock(write, b); + b = ERR_PTR(-EIO); + } else + BUG_ON(!b->written); + + return b; +} + +static void btree_node_prefetch(struct cache_set *c, struct bkey *k, int level) +{ + struct btree *b; + + mutex_lock(&c->bucket_lock); + b = mca_alloc(c, k, level, NULL); + mutex_unlock(&c->bucket_lock); + + if (!IS_ERR_OR_NULL(b)) { + bch_btree_read(b); + rw_unlock(true, b); + } +} + +/* Btree alloc */ + +static void btree_node_free(struct btree *b, struct btree_op *op) +{ + unsigned i; + + /* + * The BUG_ON() in btree_node_get() implies that we must have a write + * lock on parent to free or even invalidate a node + */ + BUG_ON(op->lock <= b->level); + BUG_ON(b == b->c->root); + pr_debug("bucket %s", pbtree(b)); + + if (btree_node_dirty(b)) + btree_complete_write(b, btree_current_write(b)); + clear_bit(BTREE_NODE_dirty, &b->flags); + + if (b->prio_blocked && + !atomic_sub_return(b->prio_blocked, &b->c->prio_blocked)) + wake_up(&b->c->alloc_wait); + + b->prio_blocked = 0; + + cancel_delayed_work(&b->work); + + mutex_lock(&b->c->bucket_lock); + + for (i = 0; i < KEY_PTRS(&b->key); i++) { + BUG_ON(atomic_read(&PTR_BUCKET(b->c, &b->key, i)->pin)); + + bch_inc_gen(PTR_CACHE(b->c, &b->key, i), + PTR_BUCKET(b->c, &b->key, i)); + } + + bch_bucket_free(b->c, &b->key); + mca_bucket_free(b); + mutex_unlock(&b->c->bucket_lock); +} + +struct btree *bch_btree_node_alloc(struct cache_set *c, int level, + struct closure *cl) +{ + BKEY_PADDED(key) k; + struct btree *b = ERR_PTR(-EAGAIN); + + mutex_lock(&c->bucket_lock); +retry: + if (__bch_bucket_alloc_set(c, WATERMARK_METADATA, &k.key, 1, cl)) + goto err; + + SET_KEY_SIZE(&k.key, c->btree_pages * PAGE_SECTORS); + + b = mca_alloc(c, &k.key, level, cl); + if (IS_ERR(b)) + goto err_free; + + if (!b) { + cache_bug(c, + "Tried to allocate bucket that was in btree cache"); + __bkey_put(c, &k.key); + goto retry; + } + + set_btree_node_read_done(b); + b->accessed = 1; + bch_bset_init_next(b); + + mutex_unlock(&c->bucket_lock); + return b; +err_free: + bch_bucket_free(c, &k.key); + __bkey_put(c, &k.key); +err: + mutex_unlock(&c->bucket_lock); + return b; +} + +static struct btree *btree_node_alloc_replacement(struct btree *b, + struct closure *cl) +{ + struct btree *n = bch_btree_node_alloc(b->c, b->level, cl); + if (!IS_ERR_OR_NULL(n)) + bch_btree_sort_into(b, n); + + return n; +} + +/* Garbage collection */ + +uint8_t __bch_btree_mark_key(struct cache_set *c, int level, struct bkey *k) +{ + uint8_t stale = 0; + unsigned i; + struct bucket *g; + + /* + * ptr_invalid() can't return true for the keys that mark btree nodes as + * freed, but since ptr_bad() returns true we'll never actually use them + * for anything and thus we don't want mark their pointers here + */ + if (!bkey_cmp(k, &ZERO_KEY)) + return stale; + + for (i = 0; i < KEY_PTRS(k); i++) { + if (!ptr_available(c, k, i)) + continue; + + g = PTR_BUCKET(c, k, i); + + if (gen_after(g->gc_gen, PTR_GEN(k, i))) + g->gc_gen = PTR_GEN(k, i); + + if (ptr_stale(c, k, i)) { + stale = max(stale, ptr_stale(c, k, i)); + continue; + } + + cache_bug_on(GC_MARK(g) && + (GC_MARK(g) == GC_MARK_METADATA) != (level != 0), + c, "inconsistent ptrs: mark = %llu, level = %i", + GC_MARK(g), level); + + if (level) + SET_GC_MARK(g, GC_MARK_METADATA); + else if (KEY_DIRTY(k)) + SET_GC_MARK(g, GC_MARK_DIRTY); + + /* guard against overflow */ + SET_GC_SECTORS_USED(g, min_t(unsigned, + GC_SECTORS_USED(g) + KEY_SIZE(k), + (1 << 14) - 1)); + + BUG_ON(!GC_SECTORS_USED(g)); + } + + return stale; +} + +#define btree_mark_key(b, k) __bch_btree_mark_key(b->c, b->level, k) + +static int btree_gc_mark_node(struct btree *b, unsigned *keys, + struct gc_stat *gc) +{ + uint8_t stale = 0; + unsigned last_dev = -1; + struct bcache_device *d = NULL; + struct bkey *k; + struct btree_iter iter; + struct bset_tree *t; + + gc->nodes++; + + for_each_key_filter(b, k, &iter, bch_ptr_invalid) { + if (last_dev != KEY_INODE(k)) { + last_dev = KEY_INODE(k); + + d = KEY_INODE(k) < b->c->nr_uuids + ? b->c->devices[last_dev] + : NULL; + } + + stale = max(stale, btree_mark_key(b, k)); + + if (bch_ptr_bad(b, k)) + continue; + + *keys += bkey_u64s(k); + + gc->key_bytes += bkey_u64s(k); + gc->nkeys++; + + gc->data += KEY_SIZE(k); + if (KEY_DIRTY(k)) { + gc->dirty += KEY_SIZE(k); + if (d) + d->sectors_dirty_gc += KEY_SIZE(k); + } + } + + for (t = b->sets; t <= &b->sets[b->nsets]; t++) + btree_bug_on(t->size && + bset_written(b, t) && + bkey_cmp(&b->key, &t->end) < 0, + b, "found short btree key in gc"); + + return stale; +} + +static struct btree *btree_gc_alloc(struct btree *b, struct bkey *k, + struct btree_op *op) +{ + /* + * We block priorities from being written for the duration of garbage + * collection, so we can't sleep in btree_alloc() -> + * bch_bucket_alloc_set(), or we'd risk deadlock - so we don't pass it + * our closure. + */ + struct btree *n = btree_node_alloc_replacement(b, NULL); + + if (!IS_ERR_OR_NULL(n)) { + swap(b, n); + + memcpy(k->ptr, b->key.ptr, + sizeof(uint64_t) * KEY_PTRS(&b->key)); + + __bkey_put(b->c, &b->key); + atomic_inc(&b->c->prio_blocked); + b->prio_blocked++; + + btree_node_free(n, op); + up_write(&n->lock); + } + + return b; +} + +/* + * Leaving this at 2 until we've got incremental garbage collection done; it + * could be higher (and has been tested with 4) except that garbage collection + * could take much longer, adversely affecting latency. + */ +#define GC_MERGE_NODES 2U + +struct gc_merge_info { + struct btree *b; + struct bkey *k; + unsigned keys; +}; + +static void btree_gc_coalesce(struct btree *b, struct btree_op *op, + struct gc_stat *gc, struct gc_merge_info *r) +{ + unsigned nodes = 0, keys = 0, blocks; + int i; + + while (nodes < GC_MERGE_NODES && r[nodes].b) + keys += r[nodes++].keys; + + blocks = btree_default_blocks(b->c) * 2 / 3; + + if (nodes < 2 || + __set_blocks(b->sets[0].data, keys, b->c) > blocks * (nodes - 1)) + return; + + for (i = nodes - 1; i >= 0; --i) { + if (r[i].b->written) + r[i].b = btree_gc_alloc(r[i].b, r[i].k, op); + + if (r[i].b->written) + return; + } + + for (i = nodes - 1; i > 0; --i) { + struct bset *n1 = r[i].b->sets->data; + struct bset *n2 = r[i - 1].b->sets->data; + struct bkey *k, *last = NULL; + + keys = 0; + + if (i == 1) { + /* + * Last node we're not getting rid of - we're getting + * rid of the node at r[0]. Have to try and fit all of + * the remaining keys into this node; we can't ensure + * they will always fit due to rounding and variable + * length keys (shouldn't be possible in practice, + * though) + */ + if (__set_blocks(n1, n1->keys + r->keys, + b->c) > btree_blocks(r[i].b)) + return; + + keys = n2->keys; + last = &r->b->key; + } else + for (k = n2->start; + k < end(n2); + k = bkey_next(k)) { + if (__set_blocks(n1, n1->keys + keys + + bkey_u64s(k), b->c) > blocks) + break; + + last = k; + keys += bkey_u64s(k); + } + + BUG_ON(__set_blocks(n1, n1->keys + keys, + b->c) > btree_blocks(r[i].b)); + + if (last) { + bkey_copy_key(&r[i].b->key, last); + bkey_copy_key(r[i].k, last); + } + + memcpy(end(n1), + n2->start, + (void *) node(n2, keys) - (void *) n2->start); + + n1->keys += keys; + + memmove(n2->start, + node(n2, keys), + (void *) end(n2) - (void *) node(n2, keys)); + + n2->keys -= keys; + + r[i].keys = n1->keys; + r[i - 1].keys = n2->keys; + } + + btree_node_free(r->b, op); + up_write(&r->b->lock); + + pr_debug("coalesced %u nodes", nodes); + + gc->nodes--; + nodes--; + + memmove(&r[0], &r[1], sizeof(struct gc_merge_info) * nodes); + memset(&r[nodes], 0, sizeof(struct gc_merge_info)); +} + +static int btree_gc_recurse(struct btree *b, struct btree_op *op, + struct closure *writes, struct gc_stat *gc) +{ + void write(struct btree *r) + { + if (!r->written) + bch_btree_write(r, true, op); + else if (btree_node_dirty(r)) { + BUG_ON(btree_current_write(r)->owner); + btree_current_write(r)->owner = writes; + closure_get(writes); + + bch_btree_write(r, true, NULL); + } + + up_write(&r->lock); + } + + int ret = 0, stale; + unsigned i; + struct gc_merge_info r[GC_MERGE_NODES]; + + memset(r, 0, sizeof(r)); + + while ((r->k = bch_next_recurse_key(b, &b->c->gc_done))) { + r->b = bch_btree_node_get(b->c, r->k, b->level - 1, op); + + if (IS_ERR(r->b)) { + ret = PTR_ERR(r->b); + break; + } + + r->keys = 0; + stale = btree_gc_mark_node(r->b, &r->keys, gc); + + if (!b->written && + (r->b->level || stale > 10 || + b->c->gc_always_rewrite)) + r->b = btree_gc_alloc(r->b, r->k, op); + + if (r->b->level) + ret = btree_gc_recurse(r->b, op, writes, gc); + + if (ret) { + write(r->b); + break; + } + + bkey_copy_key(&b->c->gc_done, r->k); + + if (!b->written) + btree_gc_coalesce(b, op, gc, r); + + if (r[GC_MERGE_NODES - 1].b) + write(r[GC_MERGE_NODES - 1].b); + + memmove(&r[1], &r[0], + sizeof(struct gc_merge_info) * (GC_MERGE_NODES - 1)); + + /* When we've got incremental GC working, we'll want to do + * if (should_resched()) + * return -EAGAIN; + */ + cond_resched(); +#if 0 + if (need_resched()) { + ret = -EAGAIN; + break; + } +#endif + } + + for (i = 1; i < GC_MERGE_NODES && r[i].b; i++) + write(r[i].b); + + /* Might have freed some children, must remove their keys */ + if (!b->written) + bch_btree_sort(b); + + return ret; +} + +static int bch_btree_gc_root(struct btree *b, struct btree_op *op, + struct closure *writes, struct gc_stat *gc) +{ + struct btree *n = NULL; + unsigned keys = 0; + int ret = 0, stale = btree_gc_mark_node(b, &keys, gc); + + if (b->level || stale > 10) + n = btree_node_alloc_replacement(b, NULL); + + if (!IS_ERR_OR_NULL(n)) + swap(b, n); + + if (b->level) + ret = btree_gc_recurse(b, op, writes, gc); + + if (!b->written || btree_node_dirty(b)) { + atomic_inc(&b->c->prio_blocked); + b->prio_blocked++; + bch_btree_write(b, true, n ? op : NULL); + } + + if (!IS_ERR_OR_NULL(n)) { + closure_sync(&op->cl); + bch_btree_set_root(b); + btree_node_free(n, op); + rw_unlock(true, b); + } + + return ret; +} + +static void btree_gc_start(struct cache_set *c) +{ + struct cache *ca; + struct bucket *b; + struct bcache_device **d; + unsigned i; + + if (!c->gc_mark_valid) + return; + + mutex_lock(&c->bucket_lock); + + c->gc_mark_valid = 0; + c->gc_done = ZERO_KEY; + + for_each_cache(ca, c, i) + for_each_bucket(b, ca) { + b->gc_gen = b->gen; + if (!atomic_read(&b->pin)) + SET_GC_MARK(b, GC_MARK_RECLAIMABLE); + } + + for (d = c->devices; + d < c->devices + c->nr_uuids; + d++) + if (*d) + (*d)->sectors_dirty_gc = 0; + + mutex_unlock(&c->bucket_lock); +} + +size_t bch_btree_gc_finish(struct cache_set *c) +{ + size_t available = 0; + struct bucket *b; + struct cache *ca; + struct bcache_device **d; + unsigned i; + + mutex_lock(&c->bucket_lock); + + set_gc_sectors(c); + c->gc_mark_valid = 1; + c->need_gc = 0; + + if (c->root) + for (i = 0; i < KEY_PTRS(&c->root->key); i++) + SET_GC_MARK(PTR_BUCKET(c, &c->root->key, i), + GC_MARK_METADATA); + + for (i = 0; i < KEY_PTRS(&c->uuid_bucket); i++) + SET_GC_MARK(PTR_BUCKET(c, &c->uuid_bucket, i), + GC_MARK_METADATA); + + for_each_cache(ca, c, i) { + uint64_t *i; + + ca->invalidate_needs_gc = 0; + + for (i = ca->sb.d; i < ca->sb.d + ca->sb.keys; i++) + SET_GC_MARK(ca->buckets + *i, GC_MARK_METADATA); + + for (i = ca->prio_buckets; + i < ca->prio_buckets + prio_buckets(ca) * 2; i++) + SET_GC_MARK(ca->buckets + *i, GC_MARK_METADATA); + + for_each_bucket(b, ca) { + b->last_gc = b->gc_gen; + c->need_gc = max(c->need_gc, bucket_gc_gen(b)); + + if (!atomic_read(&b->pin) && + GC_MARK(b) == GC_MARK_RECLAIMABLE) { + available++; + if (!GC_SECTORS_USED(b)) + bch_bucket_add_unused(ca, b); + } + } + } + + for (d = c->devices; + d < c->devices + c->nr_uuids; + d++) + if (*d) { + unsigned long last = + atomic_long_read(&((*d)->sectors_dirty)); + long difference = (*d)->sectors_dirty_gc - last; + + pr_debug("sectors dirty off by %li", difference); + + (*d)->sectors_dirty_last += difference; + + atomic_long_set(&((*d)->sectors_dirty), + (*d)->sectors_dirty_gc); + } + + mutex_unlock(&c->bucket_lock); + return available; +} + +static void bch_btree_gc(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, gc.cl); + int ret; + unsigned long available; + struct gc_stat stats; + struct closure writes; + struct btree_op op; + + uint64_t start_time = local_clock(); + trace_bcache_gc_start(c->sb.set_uuid); + blktrace_msg_all(c, "Starting gc"); + + memset(&stats, 0, sizeof(struct gc_stat)); + closure_init_stack(&writes); + bch_btree_op_init_stack(&op); + op.lock = SHRT_MAX; + + btree_gc_start(c); + + ret = btree_root(gc_root, c, &op, &writes, &stats); + closure_sync(&op.cl); + closure_sync(&writes); + + if (ret) { + blktrace_msg_all(c, "Stopped gc"); + pr_warn("gc failed!"); + + continue_at(cl, bch_btree_gc, bch_gc_wq); + } + + /* Possibly wait for new UUIDs or whatever to hit disk */ + bch_journal_meta(c, &op.cl); + closure_sync(&op.cl); + + available = bch_btree_gc_finish(c); + + bch_time_stats_update(&c->btree_gc_time, start_time); + + stats.key_bytes *= sizeof(uint64_t); + stats.dirty <<= 9; + stats.data <<= 9; + stats.in_use = (c->nbuckets - available) * 100 / c->nbuckets; + memcpy(&c->gc_stats, &stats, sizeof(struct gc_stat)); + blktrace_msg_all(c, "Finished gc"); + + trace_bcache_gc_end(c->sb.set_uuid); + wake_up(&c->alloc_wait); + + continue_at(cl, bch_moving_gc, bch_gc_wq); +} + +void bch_queue_gc(struct cache_set *c) +{ + closure_trylock_call(&c->gc.cl, bch_btree_gc, bch_gc_wq, &c->cl); +} + +/* Initial partial gc */ + +static int bch_btree_check_recurse(struct btree *b, struct btree_op *op, + unsigned long **seen) +{ + int ret; + unsigned i; + struct bkey *k; + struct bucket *g; + struct btree_iter iter; + + for_each_key_filter(b, k, &iter, bch_ptr_invalid) { + for (i = 0; i < KEY_PTRS(k); i++) { + if (!ptr_available(b->c, k, i)) + continue; + + g = PTR_BUCKET(b->c, k, i); + + if (!__test_and_set_bit(PTR_BUCKET_NR(b->c, k, i), + seen[PTR_DEV(k, i)]) || + !ptr_stale(b->c, k, i)) { + g->gen = PTR_GEN(k, i); + + if (b->level) + g->prio = BTREE_PRIO; + else if (g->prio == BTREE_PRIO) + g->prio = INITIAL_PRIO; + } + } + + btree_mark_key(b, k); + } + + if (b->level) { + k = bch_next_recurse_key(b, &ZERO_KEY); + + while (k) { + struct bkey *p = bch_next_recurse_key(b, k); + if (p) + btree_node_prefetch(b->c, p, b->level - 1); + + ret = btree(check_recurse, k, b, op, seen); + if (ret) + return ret; + + k = p; + } + } + + return 0; +} + +int bch_btree_check(struct cache_set *c, struct btree_op *op) +{ + int ret = -ENOMEM; + unsigned i; + unsigned long *seen[MAX_CACHES_PER_SET]; + + memset(seen, 0, sizeof(seen)); + + for (i = 0; c->cache[i]; i++) { + size_t n = DIV_ROUND_UP(c->cache[i]->sb.nbuckets, 8); + seen[i] = kmalloc(n, GFP_KERNEL); + if (!seen[i]) + goto err; + + /* Disables the seen array until prio_read() uses it too */ + memset(seen[i], 0xFF, n); + } + + ret = btree_root(check_recurse, c, op, seen); +err: + for (i = 0; i < MAX_CACHES_PER_SET; i++) + kfree(seen[i]); + return ret; +} + +/* Btree insertion */ + +static void shift_keys(struct btree *b, struct bkey *where, struct bkey *insert) +{ + struct bset *i = b->sets[b->nsets].data; + + memmove((uint64_t *) where + bkey_u64s(insert), + where, + (void *) end(i) - (void *) where); + + i->keys += bkey_u64s(insert); + bkey_copy(where, insert); + bch_bset_fix_lookup_table(b, where); +} + +static bool fix_overlapping_extents(struct btree *b, + struct bkey *insert, + struct btree_iter *iter, + struct btree_op *op) +{ + void subtract_dirty(struct bkey *k, int sectors) + { + struct bcache_device *d = b->c->devices[KEY_INODE(k)]; + + if (KEY_DIRTY(k) && d) + atomic_long_sub(sectors, &d->sectors_dirty); + } + + unsigned old_size, sectors_found = 0; + + while (1) { + struct bkey *k = bch_btree_iter_next(iter); + if (!k || + bkey_cmp(&START_KEY(k), insert) >= 0) + break; + + if (bkey_cmp(k, &START_KEY(insert)) <= 0) + continue; + + old_size = KEY_SIZE(k); + + /* + * We might overlap with 0 size extents; we can't skip these + * because if they're in the set we're inserting to we have to + * adjust them so they don't overlap with the key we're + * inserting. But we don't want to check them for BTREE_REPLACE + * operations. + */ + + if (op->type == BTREE_REPLACE && + KEY_SIZE(k)) { + /* + * k might have been split since we inserted/found the + * key we're replacing + */ + unsigned i; + uint64_t offset = KEY_START(k) - + KEY_START(&op->replace); + + /* But it must be a subset of the replace key */ + if (KEY_START(k) < KEY_START(&op->replace) || + KEY_OFFSET(k) > KEY_OFFSET(&op->replace)) + goto check_failed; + + /* We didn't find a key that we were supposed to */ + if (KEY_START(k) > KEY_START(insert) + sectors_found) + goto check_failed; + + if (KEY_PTRS(&op->replace) != KEY_PTRS(k)) + goto check_failed; + + /* skip past gen */ + offset <<= 8; + + BUG_ON(!KEY_PTRS(&op->replace)); + + for (i = 0; i < KEY_PTRS(&op->replace); i++) + if (k->ptr[i] != op->replace.ptr[i] + offset) + goto check_failed; + + sectors_found = KEY_OFFSET(k) - KEY_START(insert); + } + + if (bkey_cmp(insert, k) < 0 && + bkey_cmp(&START_KEY(insert), &START_KEY(k)) > 0) { + /* + * We overlapped in the middle of an existing key: that + * means we have to split the old key. But we have to do + * slightly different things depending on whether the + * old key has been written out yet. + */ + + struct bkey *top; + + subtract_dirty(k, KEY_SIZE(insert)); + + if (bkey_written(b, k)) { + /* + * We insert a new key to cover the top of the + * old key, and the old key is modified in place + * to represent the bottom split. + * + * It's completely arbitrary whether the new key + * is the top or the bottom, but it has to match + * up with what btree_sort_fixup() does - it + * doesn't check for this kind of overlap, it + * depends on us inserting a new key for the top + * here. + */ + top = bch_bset_search(b, &b->sets[b->nsets], + insert); + shift_keys(b, top, k); + } else { + BKEY_PADDED(key) temp; + bkey_copy(&temp.key, k); + shift_keys(b, k, &temp.key); + top = bkey_next(k); + } + + bch_cut_front(insert, top); + bch_cut_back(&START_KEY(insert), k); + bch_bset_fix_invalidated_key(b, k); + return false; + } + + if (bkey_cmp(insert, k) < 0) { + bch_cut_front(insert, k); + } else { + if (bkey_written(b, k) && + bkey_cmp(&START_KEY(insert), &START_KEY(k)) <= 0) { + /* + * Completely overwrote, so we don't have to + * invalidate the binary search tree + */ + bch_cut_front(k, k); + } else { + __bch_cut_back(&START_KEY(insert), k); + bch_bset_fix_invalidated_key(b, k); + } + } + + subtract_dirty(k, old_size - KEY_SIZE(k)); + } + +check_failed: + if (op->type == BTREE_REPLACE) { + if (!sectors_found) { + op->insert_collision = true; + return true; + } else if (sectors_found < KEY_SIZE(insert)) { + SET_KEY_OFFSET(insert, KEY_OFFSET(insert) - + (KEY_SIZE(insert) - sectors_found)); + SET_KEY_SIZE(insert, sectors_found); + } + } + + return false; +} + +static bool btree_insert_key(struct btree *b, struct btree_op *op, + struct bkey *k) +{ + struct bset *i = b->sets[b->nsets].data; + struct bkey *m, *prev; + const char *status = "insert"; + + BUG_ON(bkey_cmp(k, &b->key) > 0); + BUG_ON(b->level && !KEY_PTRS(k)); + BUG_ON(!b->level && !KEY_OFFSET(k)); + + if (!b->level) { + struct btree_iter iter; + struct bkey search = KEY(KEY_INODE(k), KEY_START(k), 0); + + /* + * bset_search() returns the first key that is strictly greater + * than the search key - but for back merging, we want to find + * the first key that is greater than or equal to KEY_START(k) - + * unless KEY_START(k) is 0. + */ + if (KEY_OFFSET(&search)) + SET_KEY_OFFSET(&search, KEY_OFFSET(&search) - 1); + + prev = NULL; + m = bch_btree_iter_init(b, &iter, &search); + + if (fix_overlapping_extents(b, k, &iter, op)) + return false; + + while (m != end(i) && + bkey_cmp(k, &START_KEY(m)) > 0) + prev = m, m = bkey_next(m); + + if (key_merging_disabled(b->c)) + goto insert; + + /* prev is in the tree, if we merge we're done */ + status = "back merging"; + if (prev && + bch_bkey_try_merge(b, prev, k)) + goto merged; + + status = "overwrote front"; + if (m != end(i) && + KEY_PTRS(m) == KEY_PTRS(k) && !KEY_SIZE(m)) + goto copy; + + status = "front merge"; + if (m != end(i) && + bch_bkey_try_merge(b, k, m)) + goto copy; + } else + m = bch_bset_search(b, &b->sets[b->nsets], k); + +insert: shift_keys(b, m, k); +copy: bkey_copy(m, k); +merged: + bch_check_keys(b, "%s for %s at %s: %s", status, + op_type(op), pbtree(b), pkey(k)); + bch_check_key_order_msg(b, i, "%s for %s at %s: %s", status, + op_type(op), pbtree(b), pkey(k)); + + if (b->level && !KEY_OFFSET(k)) + b->prio_blocked++; + + pr_debug("%s for %s at %s: %s", status, + op_type(op), pbtree(b), pkey(k)); + + return true; +} + +bool bch_btree_insert_keys(struct btree *b, struct btree_op *op) +{ + bool ret = false; + struct bkey *k; + unsigned oldsize = bch_count_data(b); + + while ((k = bch_keylist_pop(&op->keys))) { + bkey_put(b->c, k, b->level); + ret |= btree_insert_key(b, op, k); + } + + BUG_ON(bch_count_data(b) < oldsize); + return ret; +} + +bool bch_btree_insert_check_key(struct btree *b, struct btree_op *op, + struct bio *bio) +{ + bool ret = false; + uint64_t btree_ptr = b->key.ptr[0]; + unsigned long seq = b->seq; + BKEY_PADDED(k) tmp; + + rw_unlock(false, b); + rw_lock(true, b, b->level); + + if (b->key.ptr[0] != btree_ptr || + b->seq != seq + 1 || + should_split(b)) + goto out; + + op->replace = KEY(op->inode, bio_end(bio), bio_sectors(bio)); + + SET_KEY_PTRS(&op->replace, 1); + get_random_bytes(&op->replace.ptr[0], sizeof(uint64_t)); + + SET_PTR_DEV(&op->replace, 0, PTR_CHECK_DEV); + + bkey_copy(&tmp.k, &op->replace); + + BUG_ON(op->type != BTREE_INSERT); + BUG_ON(!btree_insert_key(b, op, &tmp.k)); + bch_btree_write(b, false, NULL); + ret = true; +out: + downgrade_write(&b->lock); + return ret; +} + +static int btree_split(struct btree *b, struct btree_op *op) +{ + bool split, root = b == b->c->root; + struct btree *n1, *n2 = NULL, *n3 = NULL; + uint64_t start_time = local_clock(); + + if (b->level) + set_closure_blocking(&op->cl); + + n1 = btree_node_alloc_replacement(b, &op->cl); + if (IS_ERR(n1)) + goto err; + + split = set_blocks(n1->sets[0].data, n1->c) > (btree_blocks(b) * 4) / 5; + + pr_debug("%ssplitting at %s keys %i", split ? "" : "not ", + pbtree(b), n1->sets[0].data->keys); + + if (split) { + unsigned keys = 0; + + n2 = bch_btree_node_alloc(b->c, b->level, &op->cl); + if (IS_ERR(n2)) + goto err_free1; + + if (root) { + n3 = bch_btree_node_alloc(b->c, b->level + 1, &op->cl); + if (IS_ERR(n3)) + goto err_free2; + } + + bch_btree_insert_keys(n1, op); + + /* Has to be a linear search because we don't have an auxiliary + * search tree yet + */ + + while (keys < (n1->sets[0].data->keys * 3) / 5) + keys += bkey_u64s(node(n1->sets[0].data, keys)); + + bkey_copy_key(&n1->key, node(n1->sets[0].data, keys)); + keys += bkey_u64s(node(n1->sets[0].data, keys)); + + n2->sets[0].data->keys = n1->sets[0].data->keys - keys; + n1->sets[0].data->keys = keys; + + memcpy(n2->sets[0].data->start, + end(n1->sets[0].data), + n2->sets[0].data->keys * sizeof(uint64_t)); + + bkey_copy_key(&n2->key, &b->key); + + bch_keylist_add(&op->keys, &n2->key); + bch_btree_write(n2, true, op); + rw_unlock(true, n2); + } else + bch_btree_insert_keys(n1, op); + + bch_keylist_add(&op->keys, &n1->key); + bch_btree_write(n1, true, op); + + if (n3) { + bkey_copy_key(&n3->key, &MAX_KEY); + bch_btree_insert_keys(n3, op); + bch_btree_write(n3, true, op); + + closure_sync(&op->cl); + bch_btree_set_root(n3); + rw_unlock(true, n3); + } else if (root) { + op->keys.top = op->keys.bottom; + closure_sync(&op->cl); + bch_btree_set_root(n1); + } else { + unsigned i; + + bkey_copy(op->keys.top, &b->key); + bkey_copy_key(op->keys.top, &ZERO_KEY); + + for (i = 0; i < KEY_PTRS(&b->key); i++) { + uint8_t g = PTR_BUCKET(b->c, &b->key, i)->gen + 1; + + SET_PTR_GEN(op->keys.top, i, g); + } + + bch_keylist_push(&op->keys); + closure_sync(&op->cl); + atomic_inc(&b->c->prio_blocked); + } + + rw_unlock(true, n1); + btree_node_free(b, op); + + bch_time_stats_update(&b->c->btree_split_time, start_time); + + return 0; +err_free2: + __bkey_put(n2->c, &n2->key); + btree_node_free(n2, op); + rw_unlock(true, n2); +err_free1: + __bkey_put(n1->c, &n1->key); + btree_node_free(n1, op); + rw_unlock(true, n1); +err: + if (n3 == ERR_PTR(-EAGAIN) || + n2 == ERR_PTR(-EAGAIN) || + n1 == ERR_PTR(-EAGAIN)) + return -EAGAIN; + + pr_warn("couldn't split"); + return -ENOMEM; +} + +static int bch_btree_insert_recurse(struct btree *b, struct btree_op *op, + struct keylist *stack_keys) +{ + if (b->level) { + int ret; + struct bkey *insert = op->keys.bottom; + struct bkey *k = bch_next_recurse_key(b, &START_KEY(insert)); + + if (!k) { + btree_bug(b, "no key to recurse on at level %i/%i", + b->level, b->c->root->level); + + op->keys.top = op->keys.bottom; + return -EIO; + } + + if (bkey_cmp(insert, k) > 0) { + unsigned i; + + if (op->type == BTREE_REPLACE) { + __bkey_put(b->c, insert); + op->keys.top = op->keys.bottom; + op->insert_collision = true; + return 0; + } + + for (i = 0; i < KEY_PTRS(insert); i++) + atomic_inc(&PTR_BUCKET(b->c, insert, i)->pin); + + bkey_copy(stack_keys->top, insert); + + bch_cut_back(k, insert); + bch_cut_front(k, stack_keys->top); + + bch_keylist_push(stack_keys); + } + + ret = btree(insert_recurse, k, b, op, stack_keys); + if (ret) + return ret; + } + + if (!bch_keylist_empty(&op->keys)) { + if (should_split(b)) { + if (op->lock <= b->c->root->level) { + BUG_ON(b->level); + op->lock = b->c->root->level + 1; + return -EINTR; + } + return btree_split(b, op); + } + + BUG_ON(write_block(b) != b->sets[b->nsets].data); + + if (bch_btree_insert_keys(b, op)) + bch_btree_write(b, false, op); + } + + return 0; +} + +int bch_btree_insert(struct btree_op *op, struct cache_set *c) +{ + int ret = 0; + struct keylist stack_keys; + + /* + * Don't want to block with the btree locked unless we have to, + * otherwise we get deadlocks with try_harder and between split/gc + */ + clear_closure_blocking(&op->cl); + + BUG_ON(bch_keylist_empty(&op->keys)); + bch_keylist_copy(&stack_keys, &op->keys); + bch_keylist_init(&op->keys); + + while (!bch_keylist_empty(&stack_keys) || + !bch_keylist_empty(&op->keys)) { + if (bch_keylist_empty(&op->keys)) { + bch_keylist_add(&op->keys, + bch_keylist_pop(&stack_keys)); + op->lock = 0; + } + + ret = btree_root(insert_recurse, c, op, &stack_keys); + + if (ret == -EAGAIN) { + ret = 0; + closure_sync(&op->cl); + } else if (ret) { + struct bkey *k; + + pr_err("error %i trying to insert key for %s", + ret, op_type(op)); + + while ((k = bch_keylist_pop(&stack_keys) ?: + bch_keylist_pop(&op->keys))) + bkey_put(c, k, 0); + } + } + + bch_keylist_free(&stack_keys); + + if (op->journal) + atomic_dec_bug(op->journal); + op->journal = NULL; + return ret; +} + +void bch_btree_set_root(struct btree *b) +{ + unsigned i; + + BUG_ON(!b->written); + + for (i = 0; i < KEY_PTRS(&b->key); i++) + BUG_ON(PTR_BUCKET(b->c, &b->key, i)->prio != BTREE_PRIO); + + mutex_lock(&b->c->bucket_lock); + list_del_init(&b->list); + mutex_unlock(&b->c->bucket_lock); + + b->c->root = b; + __bkey_put(b->c, &b->key); + + bch_journal_meta(b->c, NULL); + pr_debug("%s for %pf", pbtree(b), __builtin_return_address(0)); +} + +/* Cache lookup */ + +static int submit_partial_cache_miss(struct btree *b, struct btree_op *op, + struct bkey *k) +{ + struct search *s = container_of(op, struct search, op); + struct bio *bio = &s->bio.bio; + int ret = 0; + + while (!ret && + !op->lookup_done) { + unsigned sectors = INT_MAX; + + if (KEY_INODE(k) == op->inode) { + if (KEY_START(k) <= bio->bi_sector) + break; + + sectors = min_t(uint64_t, sectors, + KEY_START(k) - bio->bi_sector); + } + + ret = s->d->cache_miss(b, s, bio, sectors); + } + + return ret; +} + +/* + * Read from a single key, handling the initial cache miss if the key starts in + * the middle of the bio + */ +static int submit_partial_cache_hit(struct btree *b, struct btree_op *op, + struct bkey *k) +{ + struct search *s = container_of(op, struct search, op); + struct bio *bio = &s->bio.bio; + unsigned ptr; + struct bio *n; + + int ret = submit_partial_cache_miss(b, op, k); + if (ret || op->lookup_done) + return ret; + + /* XXX: figure out best pointer - for multiple cache devices */ + ptr = 0; + + PTR_BUCKET(b->c, k, ptr)->prio = INITIAL_PRIO; + + while (!op->lookup_done && + KEY_INODE(k) == op->inode && + bio->bi_sector < KEY_OFFSET(k)) { + struct bkey *bio_key; + sector_t sector = PTR_OFFSET(k, ptr) + + (bio->bi_sector - KEY_START(k)); + unsigned sectors = min_t(uint64_t, INT_MAX, + KEY_OFFSET(k) - bio->bi_sector); + + n = bch_bio_split(bio, sectors, GFP_NOIO, s->d->bio_split); + if (!n) + return -EAGAIN; + + if (n == bio) + op->lookup_done = true; + + bio_key = &container_of(n, struct bbio, bio)->key; + + /* + * The bucket we're reading from might be reused while our bio + * is in flight, and we could then end up reading the wrong + * data. + * + * We guard against this by checking (in cache_read_endio()) if + * the pointer is stale again; if so, we treat it as an error + * and reread from the backing device (but we don't pass that + * error up anywhere). + */ + + bch_bkey_copy_single_ptr(bio_key, k, ptr); + SET_PTR_OFFSET(bio_key, 0, sector); + + n->bi_end_io = bch_cache_read_endio; + n->bi_private = &s->cl; + + trace_bcache_cache_hit(n); + __bch_submit_bbio(n, b->c); + } + + return 0; +} + +int bch_btree_search_recurse(struct btree *b, struct btree_op *op) +{ + struct search *s = container_of(op, struct search, op); + struct bio *bio = &s->bio.bio; + + int ret = 0; + struct bkey *k; + struct btree_iter iter; + bch_btree_iter_init(b, &iter, &KEY(op->inode, bio->bi_sector, 0)); + + pr_debug("at %s searching for %u:%llu", pbtree(b), op->inode, + (uint64_t) bio->bi_sector); + + do { + k = bch_btree_iter_next_filter(&iter, b, bch_ptr_bad); + if (!k) { + /* + * b->key would be exactly what we want, except that + * pointers to btree nodes have nonzero size - we + * wouldn't go far enough + */ + + ret = submit_partial_cache_miss(b, op, + &KEY(KEY_INODE(&b->key), + KEY_OFFSET(&b->key), 0)); + break; + } + + ret = b->level + ? btree(search_recurse, k, b, op) + : submit_partial_cache_hit(b, op, k); + } while (!ret && + !op->lookup_done); + + return ret; +} + +/* Keybuf code */ + +static inline int keybuf_cmp(struct keybuf_key *l, struct keybuf_key *r) +{ + /* Overlapping keys compare equal */ + if (bkey_cmp(&l->key, &START_KEY(&r->key)) <= 0) + return -1; + if (bkey_cmp(&START_KEY(&l->key), &r->key) >= 0) + return 1; + return 0; +} + +static inline int keybuf_nonoverlapping_cmp(struct keybuf_key *l, + struct keybuf_key *r) +{ + return clamp_t(int64_t, bkey_cmp(&l->key, &r->key), -1, 1); +} + +static int bch_btree_refill_keybuf(struct btree *b, struct btree_op *op, + struct keybuf *buf, struct bkey *end) +{ + struct btree_iter iter; + bch_btree_iter_init(b, &iter, &buf->last_scanned); + + while (!array_freelist_empty(&buf->freelist)) { + struct bkey *k = bch_btree_iter_next_filter(&iter, b, + bch_ptr_bad); + + if (!b->level) { + if (!k) { + buf->last_scanned = b->key; + break; + } + + buf->last_scanned = *k; + if (bkey_cmp(&buf->last_scanned, end) >= 0) + break; + + if (buf->key_predicate(buf, k)) { + struct keybuf_key *w; + + pr_debug("%s", pkey(k)); + + spin_lock(&buf->lock); + + w = array_alloc(&buf->freelist); + + w->private = NULL; + bkey_copy(&w->key, k); + + if (RB_INSERT(&buf->keys, w, node, keybuf_cmp)) + array_free(&buf->freelist, w); + + spin_unlock(&buf->lock); + } + } else { + if (!k) + break; + + btree(refill_keybuf, k, b, op, buf, end); + /* + * Might get an error here, but can't really do anything + * and it'll get logged elsewhere. Just read what we + * can. + */ + + if (bkey_cmp(&buf->last_scanned, end) >= 0) + break; + + cond_resched(); + } + } + + return 0; +} + +void bch_refill_keybuf(struct cache_set *c, struct keybuf *buf, + struct bkey *end) +{ + struct bkey start = buf->last_scanned; + struct btree_op op; + bch_btree_op_init_stack(&op); + + cond_resched(); + + btree_root(refill_keybuf, c, &op, buf, end); + closure_sync(&op.cl); + + pr_debug("found %s keys from %llu:%llu to %llu:%llu", + RB_EMPTY_ROOT(&buf->keys) ? "no" : + array_freelist_empty(&buf->freelist) ? "some" : "a few", + KEY_INODE(&start), KEY_OFFSET(&start), + KEY_INODE(&buf->last_scanned), KEY_OFFSET(&buf->last_scanned)); + + spin_lock(&buf->lock); + + if (!RB_EMPTY_ROOT(&buf->keys)) { + struct keybuf_key *w; + w = RB_FIRST(&buf->keys, struct keybuf_key, node); + buf->start = START_KEY(&w->key); + + w = RB_LAST(&buf->keys, struct keybuf_key, node); + buf->end = w->key; + } else { + buf->start = MAX_KEY; + buf->end = MAX_KEY; + } + + spin_unlock(&buf->lock); +} + +static void __bch_keybuf_del(struct keybuf *buf, struct keybuf_key *w) +{ + rb_erase(&w->node, &buf->keys); + array_free(&buf->freelist, w); +} + +void bch_keybuf_del(struct keybuf *buf, struct keybuf_key *w) +{ + spin_lock(&buf->lock); + __bch_keybuf_del(buf, w); + spin_unlock(&buf->lock); +} + +bool bch_keybuf_check_overlapping(struct keybuf *buf, struct bkey *start, + struct bkey *end) +{ + bool ret = false; + struct keybuf_key *p, *w, s; + s.key = *start; + + if (bkey_cmp(end, &buf->start) <= 0 || + bkey_cmp(start, &buf->end) >= 0) + return false; + + spin_lock(&buf->lock); + w = RB_GREATER(&buf->keys, s, node, keybuf_nonoverlapping_cmp); + + while (w && bkey_cmp(&START_KEY(&w->key), end) < 0) { + p = w; + w = RB_NEXT(w, node); + + if (p->private) + ret = true; + else + __bch_keybuf_del(buf, p); + } + + spin_unlock(&buf->lock); + return ret; +} + +struct keybuf_key *bch_keybuf_next(struct keybuf *buf) +{ + struct keybuf_key *w; + spin_lock(&buf->lock); + + w = RB_FIRST(&buf->keys, struct keybuf_key, node); + + while (w && w->private) + w = RB_NEXT(w, node); + + if (w) + w->private = ERR_PTR(-EINTR); + + spin_unlock(&buf->lock); + return w; +} + +struct keybuf_key *bch_keybuf_next_rescan(struct cache_set *c, + struct keybuf *buf, + struct bkey *end) +{ + struct keybuf_key *ret; + + while (1) { + ret = bch_keybuf_next(buf); + if (ret) + break; + + if (bkey_cmp(&buf->last_scanned, end) >= 0) { + pr_debug("scan finished"); + break; + } + + bch_refill_keybuf(c, buf, end); + } + + return ret; +} + +void bch_keybuf_init(struct keybuf *buf, keybuf_pred_fn *fn) +{ + buf->key_predicate = fn; + buf->last_scanned = MAX_KEY; + buf->keys = RB_ROOT; + + spin_lock_init(&buf->lock); + array_allocator_init(&buf->freelist); +} + +void bch_btree_exit(void) +{ + if (btree_io_wq) + destroy_workqueue(btree_io_wq); + if (bch_gc_wq) + destroy_workqueue(bch_gc_wq); +} + +int __init bch_btree_init(void) +{ + if (!(bch_gc_wq = create_singlethread_workqueue("bch_btree_gc")) || + !(btree_io_wq = create_singlethread_workqueue("bch_btree_io"))) + return -ENOMEM; + + return 0; +} diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h new file mode 100644 index 000000000000..af4a7092a28c --- /dev/null +++ b/drivers/md/bcache/btree.h @@ -0,0 +1,405 @@ +#ifndef _BCACHE_BTREE_H +#define _BCACHE_BTREE_H + +/* + * THE BTREE: + * + * At a high level, bcache's btree is relatively standard b+ tree. All keys and + * pointers are in the leaves; interior nodes only have pointers to the child + * nodes. + * + * In the interior nodes, a struct bkey always points to a child btree node, and + * the key is the highest key in the child node - except that the highest key in + * an interior node is always MAX_KEY. The size field refers to the size on disk + * of the child node - this would allow us to have variable sized btree nodes + * (handy for keeping the depth of the btree 1 by expanding just the root). + * + * Btree nodes are themselves log structured, but this is hidden fairly + * thoroughly. Btree nodes on disk will in practice have extents that overlap + * (because they were written at different times), but in memory we never have + * overlapping extents - when we read in a btree node from disk, the first thing + * we do is resort all the sets of keys with a mergesort, and in the same pass + * we check for overlapping extents and adjust them appropriately. + * + * struct btree_op is a central interface to the btree code. It's used for + * specifying read vs. write locking, and the embedded closure is used for + * waiting on IO or reserve memory. + * + * BTREE CACHE: + * + * Btree nodes are cached in memory; traversing the btree might require reading + * in btree nodes which is handled mostly transparently. + * + * bch_btree_node_get() looks up a btree node in the cache and reads it in from + * disk if necessary. This function is almost never called directly though - the + * btree() macro is used to get a btree node, call some function on it, and + * unlock the node after the function returns. + * + * The root is special cased - it's taken out of the cache's lru (thus pinning + * it in memory), so we can find the root of the btree by just dereferencing a + * pointer instead of looking it up in the cache. This makes locking a bit + * tricky, since the root pointer is protected by the lock in the btree node it + * points to - the btree_root() macro handles this. + * + * In various places we must be able to allocate memory for multiple btree nodes + * in order to make forward progress. To do this we use the btree cache itself + * as a reserve; if __get_free_pages() fails, we'll find a node in the btree + * cache we can reuse. We can't allow more than one thread to be doing this at a + * time, so there's a lock, implemented by a pointer to the btree_op closure - + * this allows the btree_root() macro to implicitly release this lock. + * + * BTREE IO: + * + * Btree nodes never have to be explicitly read in; bch_btree_node_get() handles + * this. + * + * For writing, we have two btree_write structs embeddded in struct btree - one + * write in flight, and one being set up, and we toggle between them. + * + * Writing is done with a single function - bch_btree_write() really serves two + * different purposes and should be broken up into two different functions. When + * passing now = false, it merely indicates that the node is now dirty - calling + * it ensures that the dirty keys will be written at some point in the future. + * + * When passing now = true, bch_btree_write() causes a write to happen + * "immediately" (if there was already a write in flight, it'll cause the write + * to happen as soon as the previous write completes). It returns immediately + * though - but it takes a refcount on the closure in struct btree_op you passed + * to it, so a closure_sync() later can be used to wait for the write to + * complete. + * + * This is handy because btree_split() and garbage collection can issue writes + * in parallel, reducing the amount of time they have to hold write locks. + * + * LOCKING: + * + * When traversing the btree, we may need write locks starting at some level - + * inserting a key into the btree will typically only require a write lock on + * the leaf node. + * + * This is specified with the lock field in struct btree_op; lock = 0 means we + * take write locks at level <= 0, i.e. only leaf nodes. bch_btree_node_get() + * checks this field and returns the node with the appropriate lock held. + * + * If, after traversing the btree, the insertion code discovers it has to split + * then it must restart from the root and take new locks - to do this it changes + * the lock field and returns -EINTR, which causes the btree_root() macro to + * loop. + * + * Handling cache misses require a different mechanism for upgrading to a write + * lock. We do cache lookups with only a read lock held, but if we get a cache + * miss and we wish to insert this data into the cache, we have to insert a + * placeholder key to detect races - otherwise, we could race with a write and + * overwrite the data that was just written to the cache with stale data from + * the backing device. + * + * For this we use a sequence number that write locks and unlocks increment - to + * insert the check key it unlocks the btree node and then takes a write lock, + * and fails if the sequence number doesn't match. + */ + +#include "bset.h" +#include "debug.h" + +struct btree_write { + struct closure *owner; + atomic_t *journal; + + /* If btree_split() frees a btree node, it writes a new pointer to that + * btree node indicating it was freed; it takes a refcount on + * c->prio_blocked because we can't write the gens until the new + * pointer is on disk. This allows btree_write_endio() to release the + * refcount that btree_split() took. + */ + int prio_blocked; +}; + +struct btree { + /* Hottest entries first */ + struct hlist_node hash; + + /* Key/pointer for this btree node */ + BKEY_PADDED(key); + + /* Single bit - set when accessed, cleared by shrinker */ + unsigned long accessed; + unsigned long seq; + struct rw_semaphore lock; + struct cache_set *c; + + unsigned long flags; + uint16_t written; /* would be nice to kill */ + uint8_t level; + uint8_t nsets; + uint8_t page_order; + + /* + * Set of sorted keys - the real btree node - plus a binary search tree + * + * sets[0] is special; set[0]->tree, set[0]->prev and set[0]->data point + * to the memory we have allocated for this btree node. Additionally, + * set[0]->data points to the entire btree node as it exists on disk. + */ + struct bset_tree sets[MAX_BSETS]; + + /* Used to refcount bio splits, also protects b->bio */ + struct closure_with_waitlist io; + + /* Gets transferred to w->prio_blocked - see the comment there */ + int prio_blocked; + + struct list_head list; + struct delayed_work work; + + uint64_t io_start_time; + struct btree_write writes[2]; + struct bio *bio; +}; + +#define BTREE_FLAG(flag) \ +static inline bool btree_node_ ## flag(struct btree *b) \ +{ return test_bit(BTREE_NODE_ ## flag, &b->flags); } \ + \ +static inline void set_btree_node_ ## flag(struct btree *b) \ +{ set_bit(BTREE_NODE_ ## flag, &b->flags); } \ + +enum btree_flags { + BTREE_NODE_read_done, + BTREE_NODE_io_error, + BTREE_NODE_dirty, + BTREE_NODE_write_idx, +}; + +BTREE_FLAG(read_done); +BTREE_FLAG(io_error); +BTREE_FLAG(dirty); +BTREE_FLAG(write_idx); + +static inline struct btree_write *btree_current_write(struct btree *b) +{ + return b->writes + btree_node_write_idx(b); +} + +static inline struct btree_write *btree_prev_write(struct btree *b) +{ + return b->writes + (btree_node_write_idx(b) ^ 1); +} + +static inline unsigned bset_offset(struct btree *b, struct bset *i) +{ + return (((size_t) i) - ((size_t) b->sets->data)) >> 9; +} + +static inline struct bset *write_block(struct btree *b) +{ + return ((void *) b->sets[0].data) + b->written * block_bytes(b->c); +} + +static inline bool bset_written(struct btree *b, struct bset_tree *t) +{ + return t->data < write_block(b); +} + +static inline bool bkey_written(struct btree *b, struct bkey *k) +{ + return k < write_block(b)->start; +} + +static inline void set_gc_sectors(struct cache_set *c) +{ + atomic_set(&c->sectors_to_gc, c->sb.bucket_size * c->nbuckets / 8); +} + +static inline bool bch_ptr_invalid(struct btree *b, const struct bkey *k) +{ + return __bch_ptr_invalid(b->c, b->level, k); +} + +static inline struct bkey *bch_btree_iter_init(struct btree *b, + struct btree_iter *iter, + struct bkey *search) +{ + return __bch_btree_iter_init(b, iter, search, b->sets); +} + +/* Looping macros */ + +#define for_each_cached_btree(b, c, iter) \ + for (iter = 0; \ + iter < ARRAY_SIZE((c)->bucket_hash); \ + iter++) \ + hlist_for_each_entry_rcu((b), (c)->bucket_hash + iter, hash) + +#define for_each_key_filter(b, k, iter, filter) \ + for (bch_btree_iter_init((b), (iter), NULL); \ + ((k) = bch_btree_iter_next_filter((iter), b, filter));) + +#define for_each_key(b, k, iter) \ + for (bch_btree_iter_init((b), (iter), NULL); \ + ((k) = bch_btree_iter_next(iter));) + +/* Recursing down the btree */ + +struct btree_op { + struct closure cl; + struct cache_set *c; + + /* Journal entry we have a refcount on */ + atomic_t *journal; + + /* Bio to be inserted into the cache */ + struct bio *cache_bio; + + unsigned inode; + + uint16_t write_prio; + + /* Btree level at which we start taking write locks */ + short lock; + + /* Btree insertion type */ + enum { + BTREE_INSERT, + BTREE_REPLACE + } type:8; + + unsigned csum:1; + unsigned skip:1; + unsigned flush_journal:1; + + unsigned insert_data_done:1; + unsigned lookup_done:1; + unsigned insert_collision:1; + + /* Anything after this point won't get zeroed in do_bio_hook() */ + + /* Keys to be inserted */ + struct keylist keys; + BKEY_PADDED(replace); +}; + +void bch_btree_op_init_stack(struct btree_op *); + +static inline void rw_lock(bool w, struct btree *b, int level) +{ + w ? down_write_nested(&b->lock, level + 1) + : down_read_nested(&b->lock, level + 1); + if (w) + b->seq++; +} + +static inline void rw_unlock(bool w, struct btree *b) +{ +#ifdef CONFIG_BCACHE_EDEBUG + unsigned i; + + if (w && + b->key.ptr[0] && + btree_node_read_done(b)) + for (i = 0; i <= b->nsets; i++) + bch_check_key_order(b, b->sets[i].data); +#endif + + if (w) + b->seq++; + (w ? up_write : up_read)(&b->lock); +} + +#define insert_lock(s, b) ((b)->level <= (s)->lock) + +/* + * These macros are for recursing down the btree - they handle the details of + * locking and looking up nodes in the cache for you. They're best treated as + * mere syntax when reading code that uses them. + * + * op->lock determines whether we take a read or a write lock at a given depth. + * If you've got a read lock and find that you need a write lock (i.e. you're + * going to have to split), set op->lock and return -EINTR; btree_root() will + * call you again and you'll have the correct lock. + */ + +/** + * btree - recurse down the btree on a specified key + * @fn: function to call, which will be passed the child node + * @key: key to recurse on + * @b: parent btree node + * @op: pointer to struct btree_op + */ +#define btree(fn, key, b, op, ...) \ +({ \ + int _r, l = (b)->level - 1; \ + bool _w = l <= (op)->lock; \ + struct btree *_b = bch_btree_node_get((b)->c, key, l, op); \ + if (!IS_ERR(_b)) { \ + _r = bch_btree_ ## fn(_b, op, ##__VA_ARGS__); \ + rw_unlock(_w, _b); \ + } else \ + _r = PTR_ERR(_b); \ + _r; \ +}) + +/** + * btree_root - call a function on the root of the btree + * @fn: function to call, which will be passed the child node + * @c: cache set + * @op: pointer to struct btree_op + */ +#define btree_root(fn, c, op, ...) \ +({ \ + int _r = -EINTR; \ + do { \ + struct btree *_b = (c)->root; \ + bool _w = insert_lock(op, _b); \ + rw_lock(_w, _b, _b->level); \ + if (_b == (c)->root && \ + _w == insert_lock(op, _b)) \ + _r = bch_btree_ ## fn(_b, op, ##__VA_ARGS__); \ + rw_unlock(_w, _b); \ + bch_cannibalize_unlock(c, &(op)->cl); \ + } while (_r == -EINTR); \ + \ + _r; \ +}) + +static inline bool should_split(struct btree *b) +{ + struct bset *i = write_block(b); + return b->written >= btree_blocks(b) || + (i->seq == b->sets[0].data->seq && + b->written + __set_blocks(i, i->keys + 15, b->c) + > btree_blocks(b)); +} + +void bch_btree_read_done(struct closure *); +void bch_btree_read(struct btree *); +void bch_btree_write(struct btree *b, bool now, struct btree_op *op); + +void bch_cannibalize_unlock(struct cache_set *, struct closure *); +void bch_btree_set_root(struct btree *); +struct btree *bch_btree_node_alloc(struct cache_set *, int, struct closure *); +struct btree *bch_btree_node_get(struct cache_set *, struct bkey *, + int, struct btree_op *); + +bool bch_btree_insert_keys(struct btree *, struct btree_op *); +bool bch_btree_insert_check_key(struct btree *, struct btree_op *, + struct bio *); +int bch_btree_insert(struct btree_op *, struct cache_set *); + +int bch_btree_search_recurse(struct btree *, struct btree_op *); + +void bch_queue_gc(struct cache_set *); +size_t bch_btree_gc_finish(struct cache_set *); +void bch_moving_gc(struct closure *); +int bch_btree_check(struct cache_set *, struct btree_op *); +uint8_t __bch_btree_mark_key(struct cache_set *, int, struct bkey *); + +void bch_keybuf_init(struct keybuf *, keybuf_pred_fn *); +void bch_refill_keybuf(struct cache_set *, struct keybuf *, struct bkey *); +bool bch_keybuf_check_overlapping(struct keybuf *, struct bkey *, + struct bkey *); +void bch_keybuf_del(struct keybuf *, struct keybuf_key *); +struct keybuf_key *bch_keybuf_next(struct keybuf *); +struct keybuf_key *bch_keybuf_next_rescan(struct cache_set *, + struct keybuf *, struct bkey *); + +#endif diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c new file mode 100644 index 000000000000..bd05a9a8c7cf --- /dev/null +++ b/drivers/md/bcache/closure.c @@ -0,0 +1,345 @@ +/* + * Asynchronous refcounty things + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include +#include +#include + +#include "closure.h" + +void closure_queue(struct closure *cl) +{ + struct workqueue_struct *wq = cl->wq; + if (wq) { + INIT_WORK(&cl->work, cl->work.func); + BUG_ON(!queue_work(wq, &cl->work)); + } else + cl->fn(cl); +} +EXPORT_SYMBOL_GPL(closure_queue); + +#define CL_FIELD(type, field) \ + case TYPE_ ## type: \ + return &container_of(cl, struct type, cl)->field + +static struct closure_waitlist *closure_waitlist(struct closure *cl) +{ + switch (cl->type) { + CL_FIELD(closure_with_waitlist, wait); + CL_FIELD(closure_with_waitlist_and_timer, wait); + default: + return NULL; + } +} + +static struct timer_list *closure_timer(struct closure *cl) +{ + switch (cl->type) { + CL_FIELD(closure_with_timer, timer); + CL_FIELD(closure_with_waitlist_and_timer, timer); + default: + return NULL; + } +} + +static inline void closure_put_after_sub(struct closure *cl, int flags) +{ + int r = flags & CLOSURE_REMAINING_MASK; + + BUG_ON(flags & CLOSURE_GUARD_MASK); + BUG_ON(!r && (flags & ~(CLOSURE_DESTRUCTOR|CLOSURE_BLOCKING))); + + /* Must deliver precisely one wakeup */ + if (r == 1 && (flags & CLOSURE_SLEEPING)) + wake_up_process(cl->task); + + if (!r) { + if (cl->fn && !(flags & CLOSURE_DESTRUCTOR)) { + /* CLOSURE_BLOCKING might be set - clear it */ + atomic_set(&cl->remaining, + CLOSURE_REMAINING_INITIALIZER); + closure_queue(cl); + } else { + struct closure *parent = cl->parent; + struct closure_waitlist *wait = closure_waitlist(cl); + + closure_debug_destroy(cl); + + atomic_set(&cl->remaining, -1); + + if (wait) + closure_wake_up(wait); + + if (cl->fn) + cl->fn(cl); + + if (parent) + closure_put(parent); + } + } +} + +/* For clearing flags with the same atomic op as a put */ +void closure_sub(struct closure *cl, int v) +{ + closure_put_after_sub(cl, atomic_sub_return(v, &cl->remaining)); +} +EXPORT_SYMBOL_GPL(closure_sub); + +void closure_put(struct closure *cl) +{ + closure_put_after_sub(cl, atomic_dec_return(&cl->remaining)); +} +EXPORT_SYMBOL_GPL(closure_put); + +static void set_waiting(struct closure *cl, unsigned long f) +{ +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + cl->waiting_on = f; +#endif +} + +void __closure_wake_up(struct closure_waitlist *wait_list) +{ + struct llist_node *list; + struct closure *cl; + struct llist_node *reverse = NULL; + + list = llist_del_all(&wait_list->list); + + /* We first reverse the list to preserve FIFO ordering and fairness */ + + while (list) { + struct llist_node *t = list; + list = llist_next(list); + + t->next = reverse; + reverse = t; + } + + /* Then do the wakeups */ + + while (reverse) { + cl = container_of(reverse, struct closure, list); + reverse = llist_next(reverse); + + set_waiting(cl, 0); + closure_sub(cl, CLOSURE_WAITING + 1); + } +} +EXPORT_SYMBOL_GPL(__closure_wake_up); + +bool closure_wait(struct closure_waitlist *list, struct closure *cl) +{ + if (atomic_read(&cl->remaining) & CLOSURE_WAITING) + return false; + + set_waiting(cl, _RET_IP_); + atomic_add(CLOSURE_WAITING + 1, &cl->remaining); + llist_add(&cl->list, &list->list); + + return true; +} +EXPORT_SYMBOL_GPL(closure_wait); + +/** + * closure_sync() - sleep until a closure a closure has nothing left to wait on + * + * Sleeps until the refcount hits 1 - the thread that's running the closure owns + * the last refcount. + */ +void closure_sync(struct closure *cl) +{ + while (1) { + __closure_start_sleep(cl); + closure_set_ret_ip(cl); + + if ((atomic_read(&cl->remaining) & + CLOSURE_REMAINING_MASK) == 1) + break; + + schedule(); + } + + __closure_end_sleep(cl); +} +EXPORT_SYMBOL_GPL(closure_sync); + +/** + * closure_trylock() - try to acquire the closure, without waiting + * @cl: closure to lock + * + * Returns true if the closure was succesfully locked. + */ +bool closure_trylock(struct closure *cl, struct closure *parent) +{ + if (atomic_cmpxchg(&cl->remaining, -1, + CLOSURE_REMAINING_INITIALIZER) != -1) + return false; + + closure_set_ret_ip(cl); + + smp_mb(); + cl->parent = parent; + if (parent) + closure_get(parent); + + closure_debug_create(cl); + return true; +} +EXPORT_SYMBOL_GPL(closure_trylock); + +void __closure_lock(struct closure *cl, struct closure *parent, + struct closure_waitlist *wait_list) +{ + struct closure wait; + closure_init_stack(&wait); + + while (1) { + if (closure_trylock(cl, parent)) + return; + + closure_wait_event_sync(wait_list, &wait, + atomic_read(&cl->remaining) == -1); + } +} +EXPORT_SYMBOL_GPL(__closure_lock); + +static void closure_delay_timer_fn(unsigned long data) +{ + struct closure *cl = (struct closure *) data; + closure_sub(cl, CLOSURE_TIMER + 1); +} + +void do_closure_timer_init(struct closure *cl) +{ + struct timer_list *timer = closure_timer(cl); + + init_timer(timer); + timer->data = (unsigned long) cl; + timer->function = closure_delay_timer_fn; +} +EXPORT_SYMBOL_GPL(do_closure_timer_init); + +bool __closure_delay(struct closure *cl, unsigned long delay, + struct timer_list *timer) +{ + if (atomic_read(&cl->remaining) & CLOSURE_TIMER) + return false; + + BUG_ON(timer_pending(timer)); + + timer->expires = jiffies + delay; + + atomic_add(CLOSURE_TIMER + 1, &cl->remaining); + add_timer(timer); + return true; +} +EXPORT_SYMBOL_GPL(__closure_delay); + +void __closure_flush(struct closure *cl, struct timer_list *timer) +{ + if (del_timer(timer)) + closure_sub(cl, CLOSURE_TIMER + 1); +} +EXPORT_SYMBOL_GPL(__closure_flush); + +void __closure_flush_sync(struct closure *cl, struct timer_list *timer) +{ + if (del_timer_sync(timer)) + closure_sub(cl, CLOSURE_TIMER + 1); +} +EXPORT_SYMBOL_GPL(__closure_flush_sync); + +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + +static LIST_HEAD(closure_list); +static DEFINE_SPINLOCK(closure_list_lock); + +void closure_debug_create(struct closure *cl) +{ + unsigned long flags; + + BUG_ON(cl->magic == CLOSURE_MAGIC_ALIVE); + cl->magic = CLOSURE_MAGIC_ALIVE; + + spin_lock_irqsave(&closure_list_lock, flags); + list_add(&cl->all, &closure_list); + spin_unlock_irqrestore(&closure_list_lock, flags); +} +EXPORT_SYMBOL_GPL(closure_debug_create); + +void closure_debug_destroy(struct closure *cl) +{ + unsigned long flags; + + BUG_ON(cl->magic != CLOSURE_MAGIC_ALIVE); + cl->magic = CLOSURE_MAGIC_DEAD; + + spin_lock_irqsave(&closure_list_lock, flags); + list_del(&cl->all); + spin_unlock_irqrestore(&closure_list_lock, flags); +} +EXPORT_SYMBOL_GPL(closure_debug_destroy); + +static struct dentry *debug; + +#define work_data_bits(work) ((unsigned long *)(&(work)->data)) + +static int debug_seq_show(struct seq_file *f, void *data) +{ + struct closure *cl; + spin_lock_irq(&closure_list_lock); + + list_for_each_entry(cl, &closure_list, all) { + int r = atomic_read(&cl->remaining); + + seq_printf(f, "%p: %pF -> %pf p %p r %i ", + cl, (void *) cl->ip, cl->fn, cl->parent, + r & CLOSURE_REMAINING_MASK); + + seq_printf(f, "%s%s%s%s%s%s\n", + test_bit(WORK_STRUCT_PENDING, + work_data_bits(&cl->work)) ? "Q" : "", + r & CLOSURE_RUNNING ? "R" : "", + r & CLOSURE_BLOCKING ? "B" : "", + r & CLOSURE_STACK ? "S" : "", + r & CLOSURE_SLEEPING ? "Sl" : "", + r & CLOSURE_TIMER ? "T" : ""); + + if (r & CLOSURE_WAITING) + seq_printf(f, " W %pF\n", + (void *) cl->waiting_on); + + seq_printf(f, "\n"); + } + + spin_unlock_irq(&closure_list_lock); + return 0; +} + +static int debug_seq_open(struct inode *inode, struct file *file) +{ + return single_open(file, debug_seq_show, NULL); +} + +static const struct file_operations debug_ops = { + .owner = THIS_MODULE, + .open = debug_seq_open, + .read = seq_read, + .release = single_release +}; + +void __init closure_debug_init(void) +{ + debug = debugfs_create_file("closures", 0400, NULL, NULL, &debug_ops); +} + +#endif + +MODULE_AUTHOR("Kent Overstreet "); +MODULE_LICENSE("GPL"); diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h new file mode 100644 index 000000000000..00039924ea9d --- /dev/null +++ b/drivers/md/bcache/closure.h @@ -0,0 +1,672 @@ +#ifndef _LINUX_CLOSURE_H +#define _LINUX_CLOSURE_H + +#include +#include +#include + +/* + * Closure is perhaps the most overused and abused term in computer science, but + * since I've been unable to come up with anything better you're stuck with it + * again. + * + * What are closures? + * + * They embed a refcount. The basic idea is they count "things that are in + * progress" - in flight bios, some other thread that's doing something else - + * anything you might want to wait on. + * + * The refcount may be manipulated with closure_get() and closure_put(). + * closure_put() is where many of the interesting things happen, when it causes + * the refcount to go to 0. + * + * Closures can be used to wait on things both synchronously and asynchronously, + * and synchronous and asynchronous use can be mixed without restriction. To + * wait synchronously, use closure_sync() - you will sleep until your closure's + * refcount hits 1. + * + * To wait asynchronously, use + * continue_at(cl, next_function, workqueue); + * + * passing it, as you might expect, the function to run when nothing is pending + * and the workqueue to run that function out of. + * + * continue_at() also, critically, is a macro that returns the calling function. + * There's good reason for this. + * + * To use safely closures asynchronously, they must always have a refcount while + * they are running owned by the thread that is running them. Otherwise, suppose + * you submit some bios and wish to have a function run when they all complete: + * + * foo_endio(struct bio *bio, int error) + * { + * closure_put(cl); + * } + * + * closure_init(cl); + * + * do_stuff(); + * closure_get(cl); + * bio1->bi_endio = foo_endio; + * bio_submit(bio1); + * + * do_more_stuff(); + * closure_get(cl); + * bio2->bi_endio = foo_endio; + * bio_submit(bio2); + * + * continue_at(cl, complete_some_read, system_wq); + * + * If closure's refcount started at 0, complete_some_read() could run before the + * second bio was submitted - which is almost always not what you want! More + * importantly, it wouldn't be possible to say whether the original thread or + * complete_some_read()'s thread owned the closure - and whatever state it was + * associated with! + * + * So, closure_init() initializes a closure's refcount to 1 - and when a + * closure_fn is run, the refcount will be reset to 1 first. + * + * Then, the rule is - if you got the refcount with closure_get(), release it + * with closure_put() (i.e, in a bio->bi_endio function). If you have a refcount + * on a closure because you called closure_init() or you were run out of a + * closure - _always_ use continue_at(). Doing so consistently will help + * eliminate an entire class of particularly pernicious races. + * + * For a closure to wait on an arbitrary event, we need to introduce waitlists: + * + * struct closure_waitlist list; + * closure_wait_event(list, cl, condition); + * closure_wake_up(wait_list); + * + * These work analagously to wait_event() and wake_up() - except that instead of + * operating on the current thread (for wait_event()) and lists of threads, they + * operate on an explicit closure and lists of closures. + * + * Because it's a closure we can now wait either synchronously or + * asynchronously. closure_wait_event() returns the current value of the + * condition, and if it returned false continue_at() or closure_sync() can be + * used to wait for it to become true. + * + * It's useful for waiting on things when you can't sleep in the context in + * which you must check the condition (perhaps a spinlock held, or you might be + * beneath generic_make_request() - in which case you can't sleep on IO). + * + * closure_wait_event() will wait either synchronously or asynchronously, + * depending on whether the closure is in blocking mode or not. You can pick a + * mode explicitly with closure_wait_event_sync() and + * closure_wait_event_async(), which do just what you might expect. + * + * Lastly, you might have a wait list dedicated to a specific event, and have no + * need for specifying the condition - you just want to wait until someone runs + * closure_wake_up() on the appropriate wait list. In that case, just use + * closure_wait(). It will return either true or false, depending on whether the + * closure was already on a wait list or not - a closure can only be on one wait + * list at a time. + * + * Parents: + * + * closure_init() takes two arguments - it takes the closure to initialize, and + * a (possibly null) parent. + * + * If parent is non null, the new closure will have a refcount for its lifetime; + * a closure is considered to be "finished" when its refcount hits 0 and the + * function to run is null. Hence + * + * continue_at(cl, NULL, NULL); + * + * returns up the (spaghetti) stack of closures, precisely like normal return + * returns up the C stack. continue_at() with non null fn is better thought of + * as doing a tail call. + * + * All this implies that a closure should typically be embedded in a particular + * struct (which its refcount will normally control the lifetime of), and that + * struct can very much be thought of as a stack frame. + * + * Locking: + * + * Closures are based on work items but they can be thought of as more like + * threads - in that like threads and unlike work items they have a well + * defined lifetime; they are created (with closure_init()) and eventually + * complete after a continue_at(cl, NULL, NULL). + * + * Suppose you've got some larger structure with a closure embedded in it that's + * used for periodically doing garbage collection. You only want one garbage + * collection happening at a time, so the natural thing to do is protect it with + * a lock. However, it's difficult to use a lock protecting a closure correctly + * because the unlock should come after the last continue_to() (additionally, if + * you're using the closure asynchronously a mutex won't work since a mutex has + * to be unlocked by the same process that locked it). + * + * So to make it less error prone and more efficient, we also have the ability + * to use closures as locks: + * + * closure_init_unlocked(); + * closure_trylock(); + * + * That's all we need for trylock() - the last closure_put() implicitly unlocks + * it for you. But for closure_lock(), we also need a wait list: + * + * struct closure_with_waitlist frobnicator_cl; + * + * closure_init_unlocked(&frobnicator_cl); + * closure_lock(&frobnicator_cl); + * + * A closure_with_waitlist embeds a closure and a wait list - much like struct + * delayed_work embeds a work item and a timer_list. The important thing is, use + * it exactly like you would a regular closure and closure_put() will magically + * handle everything for you. + * + * We've got closures that embed timers, too. They're called, appropriately + * enough: + * struct closure_with_timer; + * + * This gives you access to closure_delay(). It takes a refcount for a specified + * number of jiffies - you could then call closure_sync() (for a slightly + * convoluted version of msleep()) or continue_at() - which gives you the same + * effect as using a delayed work item, except you can reuse the work_struct + * already embedded in struct closure. + * + * Lastly, there's struct closure_with_waitlist_and_timer. It does what you + * probably expect, if you happen to need the features of both. (You don't + * really want to know how all this is implemented, but if I've done my job + * right you shouldn't have to care). + */ + +struct closure; +typedef void (closure_fn) (struct closure *); + +struct closure_waitlist { + struct llist_head list; +}; + +enum closure_type { + TYPE_closure = 0, + TYPE_closure_with_waitlist = 1, + TYPE_closure_with_timer = 2, + TYPE_closure_with_waitlist_and_timer = 3, + MAX_CLOSURE_TYPE = 3, +}; + +enum closure_state { + /* + * CLOSURE_BLOCKING: Causes closure_wait_event() to block, instead of + * waiting asynchronously + * + * CLOSURE_WAITING: Set iff the closure is on a waitlist. Must be set by + * the thread that owns the closure, and cleared by the thread that's + * waking up the closure. + * + * CLOSURE_SLEEPING: Must be set before a thread uses a closure to sleep + * - indicates that cl->task is valid and closure_put() may wake it up. + * Only set or cleared by the thread that owns the closure. + * + * CLOSURE_TIMER: Analagous to CLOSURE_WAITING, indicates that a closure + * has an outstanding timer. Must be set by the thread that owns the + * closure, and cleared by the timer function when the timer goes off. + * + * The rest are for debugging and don't affect behaviour: + * + * CLOSURE_RUNNING: Set when a closure is running (i.e. by + * closure_init() and when closure_put() runs then next function), and + * must be cleared before remaining hits 0. Primarily to help guard + * against incorrect usage and accidentally transferring references. + * continue_at() and closure_return() clear it for you, if you're doing + * something unusual you can use closure_set_dead() which also helps + * annotate where references are being transferred. + * + * CLOSURE_STACK: Sanity check - remaining should never hit 0 on a + * closure with this flag set + */ + + CLOSURE_BITS_START = (1 << 19), + CLOSURE_DESTRUCTOR = (1 << 19), + CLOSURE_BLOCKING = (1 << 21), + CLOSURE_WAITING = (1 << 23), + CLOSURE_SLEEPING = (1 << 25), + CLOSURE_TIMER = (1 << 27), + CLOSURE_RUNNING = (1 << 29), + CLOSURE_STACK = (1 << 31), +}; + +#define CLOSURE_GUARD_MASK \ + ((CLOSURE_DESTRUCTOR|CLOSURE_BLOCKING|CLOSURE_WAITING| \ + CLOSURE_SLEEPING|CLOSURE_TIMER|CLOSURE_RUNNING|CLOSURE_STACK) << 1) + +#define CLOSURE_REMAINING_MASK (CLOSURE_BITS_START - 1) +#define CLOSURE_REMAINING_INITIALIZER (1|CLOSURE_RUNNING) + +struct closure { + union { + struct { + struct workqueue_struct *wq; + struct task_struct *task; + struct llist_node list; + closure_fn *fn; + }; + struct work_struct work; + }; + + struct closure *parent; + + atomic_t remaining; + + enum closure_type type; + +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG +#define CLOSURE_MAGIC_DEAD 0xc054dead +#define CLOSURE_MAGIC_ALIVE 0xc054a11e + + unsigned magic; + struct list_head all; + unsigned long ip; + unsigned long waiting_on; +#endif +}; + +struct closure_with_waitlist { + struct closure cl; + struct closure_waitlist wait; +}; + +struct closure_with_timer { + struct closure cl; + struct timer_list timer; +}; + +struct closure_with_waitlist_and_timer { + struct closure cl; + struct closure_waitlist wait; + struct timer_list timer; +}; + +extern unsigned invalid_closure_type(void); + +#define __CLOSURE_TYPE(cl, _t) \ + __builtin_types_compatible_p(typeof(cl), struct _t) \ + ? TYPE_ ## _t : \ + +#define __closure_type(cl) \ +( \ + __CLOSURE_TYPE(cl, closure) \ + __CLOSURE_TYPE(cl, closure_with_waitlist) \ + __CLOSURE_TYPE(cl, closure_with_timer) \ + __CLOSURE_TYPE(cl, closure_with_waitlist_and_timer) \ + invalid_closure_type() \ +) + +void closure_sub(struct closure *cl, int v); +void closure_put(struct closure *cl); +void closure_queue(struct closure *cl); +void __closure_wake_up(struct closure_waitlist *list); +bool closure_wait(struct closure_waitlist *list, struct closure *cl); +void closure_sync(struct closure *cl); + +bool closure_trylock(struct closure *cl, struct closure *parent); +void __closure_lock(struct closure *cl, struct closure *parent, + struct closure_waitlist *wait_list); + +void do_closure_timer_init(struct closure *cl); +bool __closure_delay(struct closure *cl, unsigned long delay, + struct timer_list *timer); +void __closure_flush(struct closure *cl, struct timer_list *timer); +void __closure_flush_sync(struct closure *cl, struct timer_list *timer); + +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + +void closure_debug_init(void); +void closure_debug_create(struct closure *cl); +void closure_debug_destroy(struct closure *cl); + +#else + +static inline void closure_debug_init(void) {} +static inline void closure_debug_create(struct closure *cl) {} +static inline void closure_debug_destroy(struct closure *cl) {} + +#endif + +static inline void closure_set_ip(struct closure *cl) +{ +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + cl->ip = _THIS_IP_; +#endif +} + +static inline void closure_set_ret_ip(struct closure *cl) +{ +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + cl->ip = _RET_IP_; +#endif +} + +static inline void closure_get(struct closure *cl) +{ +#ifdef CONFIG_BCACHE_CLOSURES_DEBUG + BUG_ON((atomic_inc_return(&cl->remaining) & + CLOSURE_REMAINING_MASK) <= 1); +#else + atomic_inc(&cl->remaining); +#endif +} + +static inline void closure_set_stopped(struct closure *cl) +{ + atomic_sub(CLOSURE_RUNNING, &cl->remaining); +} + +static inline bool closure_is_stopped(struct closure *cl) +{ + return !(atomic_read(&cl->remaining) & CLOSURE_RUNNING); +} + +static inline bool closure_is_unlocked(struct closure *cl) +{ + return atomic_read(&cl->remaining) == -1; +} + +static inline void do_closure_init(struct closure *cl, struct closure *parent, + bool running) +{ + switch (cl->type) { + case TYPE_closure_with_timer: + case TYPE_closure_with_waitlist_and_timer: + do_closure_timer_init(cl); + default: + break; + } + + cl->parent = parent; + if (parent) + closure_get(parent); + + if (running) { + closure_debug_create(cl); + atomic_set(&cl->remaining, CLOSURE_REMAINING_INITIALIZER); + } else + atomic_set(&cl->remaining, -1); + + closure_set_ip(cl); +} + +/* + * Hack to get at the embedded closure if there is one, by doing an unsafe cast: + * the result of __closure_type() is thrown away, it's used merely for type + * checking. + */ +#define __to_internal_closure(cl) \ +({ \ + BUILD_BUG_ON(__closure_type(*cl) > MAX_CLOSURE_TYPE); \ + (struct closure *) cl; \ +}) + +#define closure_init_type(cl, parent, running) \ +do { \ + struct closure *_cl = __to_internal_closure(cl); \ + _cl->type = __closure_type(*(cl)); \ + do_closure_init(_cl, parent, running); \ +} while (0) + +/** + * __closure_init() - Initialize a closure, skipping the memset() + * + * May be used instead of closure_init() when memory has already been zeroed. + */ +#define __closure_init(cl, parent) \ + closure_init_type(cl, parent, true) + +/** + * closure_init() - Initialize a closure, setting the refcount to 1 + * @cl: closure to initialize + * @parent: parent of the new closure. cl will take a refcount on it for its + * lifetime; may be NULL. + */ +#define closure_init(cl, parent) \ +do { \ + memset((cl), 0, sizeof(*(cl))); \ + __closure_init(cl, parent); \ +} while (0) + +static inline void closure_init_stack(struct closure *cl) +{ + memset(cl, 0, sizeof(struct closure)); + atomic_set(&cl->remaining, CLOSURE_REMAINING_INITIALIZER| + CLOSURE_BLOCKING|CLOSURE_STACK); +} + +/** + * closure_init_unlocked() - Initialize a closure but leave it unlocked. + * @cl: closure to initialize + * + * For when the closure will be used as a lock. The closure may not be used + * until after a closure_lock() or closure_trylock(). + */ +#define closure_init_unlocked(cl) \ +do { \ + memset((cl), 0, sizeof(*(cl))); \ + closure_init_type(cl, NULL, false); \ +} while (0) + +/** + * closure_lock() - lock and initialize a closure. + * @cl: the closure to lock + * @parent: the new parent for this closure + * + * The closure must be of one of the types that has a waitlist (otherwise we + * wouldn't be able to sleep on contention). + * + * @parent has exactly the same meaning as in closure_init(); if non null, the + * closure will take a reference on @parent which will be released when it is + * unlocked. + */ +#define closure_lock(cl, parent) \ + __closure_lock(__to_internal_closure(cl), parent, &(cl)->wait) + +/** + * closure_delay() - delay some number of jiffies + * @cl: the closure that will sleep + * @delay: the delay in jiffies + * + * Takes a refcount on @cl which will be released after @delay jiffies; this may + * be used to have a function run after a delay with continue_at(), or + * closure_sync() may be used for a convoluted version of msleep(). + */ +#define closure_delay(cl, delay) \ + __closure_delay(__to_internal_closure(cl), delay, &(cl)->timer) + +#define closure_flush(cl) \ + __closure_flush(__to_internal_closure(cl), &(cl)->timer) + +#define closure_flush_sync(cl) \ + __closure_flush_sync(__to_internal_closure(cl), &(cl)->timer) + +static inline void __closure_end_sleep(struct closure *cl) +{ + __set_current_state(TASK_RUNNING); + + if (atomic_read(&cl->remaining) & CLOSURE_SLEEPING) + atomic_sub(CLOSURE_SLEEPING, &cl->remaining); +} + +static inline void __closure_start_sleep(struct closure *cl) +{ + closure_set_ip(cl); + cl->task = current; + set_current_state(TASK_UNINTERRUPTIBLE); + + if (!(atomic_read(&cl->remaining) & CLOSURE_SLEEPING)) + atomic_add(CLOSURE_SLEEPING, &cl->remaining); +} + +/** + * closure_blocking() - returns true if the closure is in blocking mode. + * + * If a closure is in blocking mode, closure_wait_event() will sleep until the + * condition is true instead of waiting asynchronously. + */ +static inline bool closure_blocking(struct closure *cl) +{ + return atomic_read(&cl->remaining) & CLOSURE_BLOCKING; +} + +/** + * set_closure_blocking() - put a closure in blocking mode. + * + * If a closure is in blocking mode, closure_wait_event() will sleep until the + * condition is true instead of waiting asynchronously. + * + * Not thread safe - can only be called by the thread running the closure. + */ +static inline void set_closure_blocking(struct closure *cl) +{ + if (!closure_blocking(cl)) + atomic_add(CLOSURE_BLOCKING, &cl->remaining); +} + +/* + * Not thread safe - can only be called by the thread running the closure. + */ +static inline void clear_closure_blocking(struct closure *cl) +{ + if (closure_blocking(cl)) + atomic_sub(CLOSURE_BLOCKING, &cl->remaining); +} + +/** + * closure_wake_up() - wake up all closures on a wait list. + */ +static inline void closure_wake_up(struct closure_waitlist *list) +{ + smp_mb(); + __closure_wake_up(list); +} + +/* + * Wait on an event, synchronously or asynchronously - analogous to wait_event() + * but for closures. + * + * The loop is oddly structured so as to avoid a race; we must check the + * condition again after we've added ourself to the waitlist. We know if we were + * already on the waitlist because closure_wait() returns false; thus, we only + * schedule or break if closure_wait() returns false. If it returns true, we + * just loop again - rechecking the condition. + * + * The __closure_wake_up() is necessary because we may race with the event + * becoming true; i.e. we see event false -> wait -> recheck condition, but the + * thread that made the event true may have called closure_wake_up() before we + * added ourself to the wait list. + * + * We have to call closure_sync() at the end instead of just + * __closure_end_sleep() because a different thread might've called + * closure_wake_up() before us and gotten preempted before they dropped the + * refcount on our closure. If this was a stack allocated closure, that would be + * bad. + */ +#define __closure_wait_event(list, cl, condition, _block) \ +({ \ + bool block = _block; \ + typeof(condition) ret; \ + \ + while (1) { \ + ret = (condition); \ + if (ret) { \ + __closure_wake_up(list); \ + if (block) \ + closure_sync(cl); \ + \ + break; \ + } \ + \ + if (block) \ + __closure_start_sleep(cl); \ + \ + if (!closure_wait(list, cl)) { \ + if (!block) \ + break; \ + \ + schedule(); \ + } \ + } \ + \ + ret; \ +}) + +/** + * closure_wait_event() - wait on a condition, synchronously or asynchronously. + * @list: the wait list to wait on + * @cl: the closure that is doing the waiting + * @condition: a C expression for the event to wait for + * + * If the closure is in blocking mode, sleeps until the @condition evaluates to + * true - exactly like wait_event(). + * + * If the closure is not in blocking mode, waits asynchronously; if the + * condition is currently false the @cl is put onto @list and returns. @list + * owns a refcount on @cl; closure_sync() or continue_at() may be used later to + * wait for another thread to wake up @list, which drops the refcount on @cl. + * + * Returns the value of @condition; @cl will be on @list iff @condition was + * false. + * + * closure_wake_up(@list) must be called after changing any variable that could + * cause @condition to become true. + */ +#define closure_wait_event(list, cl, condition) \ + __closure_wait_event(list, cl, condition, closure_blocking(cl)) + +#define closure_wait_event_async(list, cl, condition) \ + __closure_wait_event(list, cl, condition, false) + +#define closure_wait_event_sync(list, cl, condition) \ + __closure_wait_event(list, cl, condition, true) + +static inline void set_closure_fn(struct closure *cl, closure_fn *fn, + struct workqueue_struct *wq) +{ + BUG_ON(object_is_on_stack(cl)); + closure_set_ip(cl); + cl->fn = fn; + cl->wq = wq; + /* between atomic_dec() in closure_put() */ + smp_mb__before_atomic_dec(); +} + +#define continue_at(_cl, _fn, _wq) \ +do { \ + set_closure_fn(_cl, _fn, _wq); \ + closure_sub(_cl, CLOSURE_RUNNING + 1); \ + return; \ +} while (0) + +#define closure_return(_cl) continue_at((_cl), NULL, NULL) + +#define continue_at_nobarrier(_cl, _fn, _wq) \ +do { \ + set_closure_fn(_cl, _fn, _wq); \ + closure_queue(cl); \ + return; \ +} while (0) + +#define closure_return_with_destructor(_cl, _destructor) \ +do { \ + set_closure_fn(_cl, _destructor, NULL); \ + closure_sub(_cl, CLOSURE_RUNNING - CLOSURE_DESTRUCTOR + 1); \ + return; \ +} while (0) + +static inline void closure_call(struct closure *cl, closure_fn fn, + struct workqueue_struct *wq, + struct closure *parent) +{ + closure_init(cl, parent); + continue_at_nobarrier(cl, fn, wq); +} + +static inline void closure_trylock_call(struct closure *cl, closure_fn fn, + struct workqueue_struct *wq, + struct closure *parent) +{ + if (closure_trylock(cl, parent)) + continue_at_nobarrier(cl, fn, wq); +} + +#endif /* _LINUX_CLOSURE_H */ diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c new file mode 100644 index 000000000000..89fd5204924e --- /dev/null +++ b/drivers/md/bcache/debug.c @@ -0,0 +1,565 @@ +/* + * Assorted bcache debug code + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +#include +#include +#include +#include +#include + +static struct dentry *debug; + +const char *bch_ptr_status(struct cache_set *c, const struct bkey *k) +{ + unsigned i; + + for (i = 0; i < KEY_PTRS(k); i++) + if (ptr_available(c, k, i)) { + struct cache *ca = PTR_CACHE(c, k, i); + size_t bucket = PTR_BUCKET_NR(c, k, i); + size_t r = bucket_remainder(c, PTR_OFFSET(k, i)); + + if (KEY_SIZE(k) + r > c->sb.bucket_size) + return "bad, length too big"; + if (bucket < ca->sb.first_bucket) + return "bad, short offset"; + if (bucket >= ca->sb.nbuckets) + return "bad, offset past end of device"; + if (ptr_stale(c, k, i)) + return "stale"; + } + + if (!bkey_cmp(k, &ZERO_KEY)) + return "bad, null key"; + if (!KEY_PTRS(k)) + return "bad, no pointers"; + if (!KEY_SIZE(k)) + return "zeroed key"; + return ""; +} + +struct keyprint_hack bch_pkey(const struct bkey *k) +{ + unsigned i = 0; + struct keyprint_hack r; + char *out = r.s, *end = r.s + KEYHACK_SIZE; + +#define p(...) (out += scnprintf(out, end - out, __VA_ARGS__)) + + p("%llu:%llu len %llu -> [", KEY_INODE(k), KEY_OFFSET(k), KEY_SIZE(k)); + + if (KEY_PTRS(k)) + while (1) { + p("%llu:%llu gen %llu", + PTR_DEV(k, i), PTR_OFFSET(k, i), PTR_GEN(k, i)); + + if (++i == KEY_PTRS(k)) + break; + + p(", "); + } + + p("]"); + + if (KEY_DIRTY(k)) + p(" dirty"); + if (KEY_CSUM(k)) + p(" cs%llu %llx", KEY_CSUM(k), k->ptr[1]); +#undef p + return r; +} + +struct keyprint_hack bch_pbtree(const struct btree *b) +{ + struct keyprint_hack r; + + snprintf(r.s, 40, "%zu level %i/%i", PTR_BUCKET_NR(b->c, &b->key, 0), + b->level, b->c->root ? b->c->root->level : -1); + return r; +} + +#if defined(CONFIG_BCACHE_DEBUG) || defined(CONFIG_BCACHE_EDEBUG) + +static bool skipped_backwards(struct btree *b, struct bkey *k) +{ + return bkey_cmp(k, (!b->level) + ? &START_KEY(bkey_next(k)) + : bkey_next(k)) > 0; +} + +static void dump_bset(struct btree *b, struct bset *i) +{ + struct bkey *k; + unsigned j; + + for (k = i->start; k < end(i); k = bkey_next(k)) { + printk(KERN_ERR "block %zu key %zi/%u: %s", index(i, b), + (uint64_t *) k - i->d, i->keys, pkey(k)); + + for (j = 0; j < KEY_PTRS(k); j++) { + size_t n = PTR_BUCKET_NR(b->c, k, j); + printk(" bucket %zu", n); + + if (n >= b->c->sb.first_bucket && n < b->c->sb.nbuckets) + printk(" prio %i", + PTR_BUCKET(b->c, k, j)->prio); + } + + printk(" %s\n", bch_ptr_status(b->c, k)); + + if (bkey_next(k) < end(i) && + skipped_backwards(b, k)) + printk(KERN_ERR "Key skipped backwards\n"); + } +} + +#endif + +#ifdef CONFIG_BCACHE_DEBUG + +void bch_btree_verify(struct btree *b, struct bset *new) +{ + struct btree *v = b->c->verify_data; + struct closure cl; + closure_init_stack(&cl); + + if (!b->c->verify) + return; + + closure_wait_event(&b->io.wait, &cl, + atomic_read(&b->io.cl.remaining) == -1); + + mutex_lock(&b->c->verify_lock); + + bkey_copy(&v->key, &b->key); + v->written = 0; + v->level = b->level; + + bch_btree_read(v); + closure_wait_event(&v->io.wait, &cl, + atomic_read(&b->io.cl.remaining) == -1); + + if (new->keys != v->sets[0].data->keys || + memcmp(new->start, + v->sets[0].data->start, + (void *) end(new) - (void *) new->start)) { + unsigned i, j; + + console_lock(); + + printk(KERN_ERR "*** original memory node:\n"); + for (i = 0; i <= b->nsets; i++) + dump_bset(b, b->sets[i].data); + + printk(KERN_ERR "*** sorted memory node:\n"); + dump_bset(b, new); + + printk(KERN_ERR "*** on disk node:\n"); + dump_bset(v, v->sets[0].data); + + for (j = 0; j < new->keys; j++) + if (new->d[j] != v->sets[0].data->d[j]) + break; + + console_unlock(); + panic("verify failed at %u\n", j); + } + + mutex_unlock(&b->c->verify_lock); +} + +static void data_verify_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + closure_put(cl); +} + +void bch_data_verify(struct search *s) +{ + char name[BDEVNAME_SIZE]; + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + struct closure *cl = &s->cl; + struct bio *check; + struct bio_vec *bv; + int i; + + if (!s->unaligned_bvec) + bio_for_each_segment(bv, s->orig_bio, i) + bv->bv_offset = 0, bv->bv_len = PAGE_SIZE; + + check = bio_clone(s->orig_bio, GFP_NOIO); + if (!check) + return; + + if (bch_bio_alloc_pages(check, GFP_NOIO)) + goto out_put; + + check->bi_rw = READ_SYNC; + check->bi_private = cl; + check->bi_end_io = data_verify_endio; + + closure_bio_submit(check, cl, &dc->disk); + closure_sync(cl); + + bio_for_each_segment(bv, s->orig_bio, i) { + void *p1 = kmap(bv->bv_page); + void *p2 = kmap(check->bi_io_vec[i].bv_page); + + if (memcmp(p1 + bv->bv_offset, + p2 + bv->bv_offset, + bv->bv_len)) + printk(KERN_ERR + "bcache (%s): verify failed at sector %llu\n", + bdevname(dc->bdev, name), + (uint64_t) s->orig_bio->bi_sector); + + kunmap(bv->bv_page); + kunmap(check->bi_io_vec[i].bv_page); + } + + __bio_for_each_segment(bv, check, i, 0) + __free_page(bv->bv_page); +out_put: + bio_put(check); +} + +#endif + +#ifdef CONFIG_BCACHE_EDEBUG + +unsigned bch_count_data(struct btree *b) +{ + unsigned ret = 0; + struct btree_iter iter; + struct bkey *k; + + if (!b->level) + for_each_key(b, k, &iter) + ret += KEY_SIZE(k); + return ret; +} + +static void vdump_bucket_and_panic(struct btree *b, const char *fmt, + va_list args) +{ + unsigned i; + + console_lock(); + + for (i = 0; i <= b->nsets; i++) + dump_bset(b, b->sets[i].data); + + vprintk(fmt, args); + + console_unlock(); + + panic("at %s\n", pbtree(b)); +} + +void bch_check_key_order_msg(struct btree *b, struct bset *i, + const char *fmt, ...) +{ + struct bkey *k; + + if (!i->keys) + return; + + for (k = i->start; bkey_next(k) < end(i); k = bkey_next(k)) + if (skipped_backwards(b, k)) { + va_list args; + va_start(args, fmt); + + vdump_bucket_and_panic(b, fmt, args); + va_end(args); + } +} + +void bch_check_keys(struct btree *b, const char *fmt, ...) +{ + va_list args; + struct bkey *k, *p = NULL; + struct btree_iter iter; + + if (b->level) + return; + + for_each_key(b, k, &iter) { + if (p && bkey_cmp(&START_KEY(p), &START_KEY(k)) > 0) { + printk(KERN_ERR "Keys out of order:\n"); + goto bug; + } + + if (bch_ptr_invalid(b, k)) + continue; + + if (p && bkey_cmp(p, &START_KEY(k)) > 0) { + printk(KERN_ERR "Overlapping keys:\n"); + goto bug; + } + p = k; + } + return; +bug: + va_start(args, fmt); + vdump_bucket_and_panic(b, fmt, args); + va_end(args); +} + +#endif + +#ifdef CONFIG_DEBUG_FS + +/* XXX: cache set refcounting */ + +struct dump_iterator { + char buf[PAGE_SIZE]; + size_t bytes; + struct cache_set *c; + struct keybuf keys; +}; + +static bool dump_pred(struct keybuf *buf, struct bkey *k) +{ + return true; +} + +static ssize_t bch_dump_read(struct file *file, char __user *buf, + size_t size, loff_t *ppos) +{ + struct dump_iterator *i = file->private_data; + ssize_t ret = 0; + + while (size) { + struct keybuf_key *w; + unsigned bytes = min(i->bytes, size); + + int err = copy_to_user(buf, i->buf, bytes); + if (err) + return err; + + ret += bytes; + buf += bytes; + size -= bytes; + i->bytes -= bytes; + memmove(i->buf, i->buf + bytes, i->bytes); + + if (i->bytes) + break; + + w = bch_keybuf_next_rescan(i->c, &i->keys, &MAX_KEY); + if (!w) + break; + + i->bytes = snprintf(i->buf, PAGE_SIZE, "%s\n", pkey(&w->key)); + bch_keybuf_del(&i->keys, w); + } + + return ret; +} + +static int bch_dump_open(struct inode *inode, struct file *file) +{ + struct cache_set *c = inode->i_private; + struct dump_iterator *i; + + i = kzalloc(sizeof(struct dump_iterator), GFP_KERNEL); + if (!i) + return -ENOMEM; + + file->private_data = i; + i->c = c; + bch_keybuf_init(&i->keys, dump_pred); + i->keys.last_scanned = KEY(0, 0, 0); + + return 0; +} + +static int bch_dump_release(struct inode *inode, struct file *file) +{ + kfree(file->private_data); + return 0; +} + +static const struct file_operations cache_set_debug_ops = { + .owner = THIS_MODULE, + .open = bch_dump_open, + .read = bch_dump_read, + .release = bch_dump_release +}; + +void bch_debug_init_cache_set(struct cache_set *c) +{ + if (!IS_ERR_OR_NULL(debug)) { + char name[50]; + snprintf(name, 50, "bcache-%pU", c->sb.set_uuid); + + c->debug = debugfs_create_file(name, 0400, debug, c, + &cache_set_debug_ops); + } +} + +#endif + +/* Fuzz tester has rotted: */ +#if 0 + +static ssize_t btree_fuzz(struct kobject *k, struct kobj_attribute *a, + const char *buffer, size_t size) +{ + void dump(struct btree *b) + { + struct bset *i; + + for (i = b->sets[0].data; + index(i, b) < btree_blocks(b) && + i->seq == b->sets[0].data->seq; + i = ((void *) i) + set_blocks(i, b->c) * block_bytes(b->c)) + dump_bset(b, i); + } + + struct cache_sb *sb; + struct cache_set *c; + struct btree *all[3], *b, *fill, *orig; + int j; + + struct btree_op op; + bch_btree_op_init_stack(&op); + + sb = kzalloc(sizeof(struct cache_sb), GFP_KERNEL); + if (!sb) + return -ENOMEM; + + sb->bucket_size = 128; + sb->block_size = 4; + + c = bch_cache_set_alloc(sb); + if (!c) + return -ENOMEM; + + for (j = 0; j < 3; j++) { + BUG_ON(list_empty(&c->btree_cache)); + all[j] = list_first_entry(&c->btree_cache, struct btree, list); + list_del_init(&all[j]->list); + + all[j]->key = KEY(0, 0, c->sb.bucket_size); + bkey_copy_key(&all[j]->key, &MAX_KEY); + } + + b = all[0]; + fill = all[1]; + orig = all[2]; + + while (1) { + for (j = 0; j < 3; j++) + all[j]->written = all[j]->nsets = 0; + + bch_bset_init_next(b); + + while (1) { + struct bset *i = write_block(b); + struct bkey *k = op.keys.top; + unsigned rand; + + bkey_init(k); + rand = get_random_int(); + + op.type = rand & 1 + ? BTREE_INSERT + : BTREE_REPLACE; + rand >>= 1; + + SET_KEY_SIZE(k, bucket_remainder(c, rand)); + rand >>= c->bucket_bits; + rand &= 1024 * 512 - 1; + rand += c->sb.bucket_size; + SET_KEY_OFFSET(k, rand); +#if 0 + SET_KEY_PTRS(k, 1); +#endif + bch_keylist_push(&op.keys); + bch_btree_insert_keys(b, &op); + + if (should_split(b) || + set_blocks(i, b->c) != + __set_blocks(i, i->keys + 15, b->c)) { + i->csum = csum_set(i); + + memcpy(write_block(fill), + i, set_bytes(i)); + + b->written += set_blocks(i, b->c); + fill->written = b->written; + if (b->written == btree_blocks(b)) + break; + + bch_btree_sort_lazy(b); + bch_bset_init_next(b); + } + } + + memcpy(orig->sets[0].data, + fill->sets[0].data, + btree_bytes(c)); + + bch_btree_sort(b); + fill->written = 0; + bch_btree_read_done(&fill->io.cl); + + if (b->sets[0].data->keys != fill->sets[0].data->keys || + memcmp(b->sets[0].data->start, + fill->sets[0].data->start, + b->sets[0].data->keys * sizeof(uint64_t))) { + struct bset *i = b->sets[0].data; + struct bkey *k, *l; + + for (k = i->start, + l = fill->sets[0].data->start; + k < end(i); + k = bkey_next(k), l = bkey_next(l)) + if (bkey_cmp(k, l) || + KEY_SIZE(k) != KEY_SIZE(l)) + pr_err("key %zi differs: %s != %s", + (uint64_t *) k - i->d, + pkey(k), pkey(l)); + + for (j = 0; j < 3; j++) { + pr_err("**** Set %i ****", j); + dump(all[j]); + } + panic("\n"); + } + + pr_info("fuzz complete: %i keys", b->sets[0].data->keys); + } +} + +kobj_attribute_write(fuzz, btree_fuzz); +#endif + +void bch_debug_exit(void) +{ + if (!IS_ERR_OR_NULL(debug)) + debugfs_remove_recursive(debug); +} + +int __init bch_debug_init(struct kobject *kobj) +{ + int ret = 0; +#if 0 + ret = sysfs_create_file(kobj, &ksysfs_fuzz.attr); + if (ret) + return ret; +#endif + + debug = debugfs_create_dir("bcache", NULL); + return ret; +} diff --git a/drivers/md/bcache/debug.h b/drivers/md/bcache/debug.h new file mode 100644 index 000000000000..f9378a218148 --- /dev/null +++ b/drivers/md/bcache/debug.h @@ -0,0 +1,54 @@ +#ifndef _BCACHE_DEBUG_H +#define _BCACHE_DEBUG_H + +/* Btree/bkey debug printing */ + +#define KEYHACK_SIZE 80 +struct keyprint_hack { + char s[KEYHACK_SIZE]; +}; + +struct keyprint_hack bch_pkey(const struct bkey *k); +struct keyprint_hack bch_pbtree(const struct btree *b); +#define pkey(k) (&bch_pkey(k).s[0]) +#define pbtree(b) (&bch_pbtree(b).s[0]) + +#ifdef CONFIG_BCACHE_EDEBUG + +unsigned bch_count_data(struct btree *); +void bch_check_key_order_msg(struct btree *, struct bset *, const char *, ...); +void bch_check_keys(struct btree *, const char *, ...); + +#define bch_check_key_order(b, i) \ + bch_check_key_order_msg(b, i, "keys out of order") +#define EBUG_ON(cond) BUG_ON(cond) + +#else /* EDEBUG */ + +#define bch_count_data(b) 0 +#define bch_check_key_order(b, i) do {} while (0) +#define bch_check_key_order_msg(b, i, ...) do {} while (0) +#define bch_check_keys(b, ...) do {} while (0) +#define EBUG_ON(cond) do {} while (0) + +#endif + +#ifdef CONFIG_BCACHE_DEBUG + +void bch_btree_verify(struct btree *, struct bset *); +void bch_data_verify(struct search *); + +#else /* DEBUG */ + +static inline void bch_btree_verify(struct btree *b, struct bset *i) {} +static inline void bch_data_verify(struct search *s) {}; + +#endif + +#ifdef CONFIG_DEBUG_FS +void bch_debug_init_cache_set(struct cache_set *); +#else +static inline void bch_debug_init_cache_set(struct cache_set *c) {} +#endif + +#endif diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c new file mode 100644 index 000000000000..48efd4dea645 --- /dev/null +++ b/drivers/md/bcache/io.c @@ -0,0 +1,397 @@ +/* + * Some low level IO code, and hacks for various block layer limitations + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "bset.h" +#include "debug.h" + +static void bch_bi_idx_hack_endio(struct bio *bio, int error) +{ + struct bio *p = bio->bi_private; + + bio_endio(p, error); + bio_put(bio); +} + +static void bch_generic_make_request_hack(struct bio *bio) +{ + if (bio->bi_idx) { + struct bio *clone = bio_alloc(GFP_NOIO, bio_segments(bio)); + + memcpy(clone->bi_io_vec, + bio_iovec(bio), + bio_segments(bio) * sizeof(struct bio_vec)); + + clone->bi_sector = bio->bi_sector; + clone->bi_bdev = bio->bi_bdev; + clone->bi_rw = bio->bi_rw; + clone->bi_vcnt = bio_segments(bio); + clone->bi_size = bio->bi_size; + + clone->bi_private = bio; + clone->bi_end_io = bch_bi_idx_hack_endio; + + bio = clone; + } + + /* + * Hack, since drivers that clone bios clone up to bi_max_vecs, but our + * bios might have had more than that (before we split them per device + * limitations). + * + * To be taken out once immutable bvec stuff is in. + */ + bio->bi_max_vecs = bio->bi_vcnt; + + generic_make_request(bio); +} + +/** + * bch_bio_split - split a bio + * @bio: bio to split + * @sectors: number of sectors to split from the front of @bio + * @gfp: gfp mask + * @bs: bio set to allocate from + * + * Allocates and returns a new bio which represents @sectors from the start of + * @bio, and updates @bio to represent the remaining sectors. + * + * If bio_sectors(@bio) was less than or equal to @sectors, returns @bio + * unchanged. + * + * The newly allocated bio will point to @bio's bi_io_vec, if the split was on a + * bvec boundry; it is the caller's responsibility to ensure that @bio is not + * freed before the split. + * + * If bch_bio_split() is running under generic_make_request(), it's not safe to + * allocate more than one bio from the same bio set. Therefore, if it is running + * under generic_make_request() it masks out __GFP_WAIT when doing the + * allocation. The caller must check for failure if there's any possibility of + * it being called from under generic_make_request(); it is then the caller's + * responsibility to retry from a safe context (by e.g. punting to workqueue). + */ +struct bio *bch_bio_split(struct bio *bio, int sectors, + gfp_t gfp, struct bio_set *bs) +{ + unsigned idx = bio->bi_idx, vcnt = 0, nbytes = sectors << 9; + struct bio_vec *bv; + struct bio *ret = NULL; + + BUG_ON(sectors <= 0); + + /* + * If we're being called from underneath generic_make_request() and we + * already allocated any bios from this bio set, we risk deadlock if we + * use the mempool. So instead, we possibly fail and let the caller punt + * to workqueue or somesuch and retry in a safe context. + */ + if (current->bio_list) + gfp &= ~__GFP_WAIT; + + if (sectors >= bio_sectors(bio)) + return bio; + + if (bio->bi_rw & REQ_DISCARD) { + ret = bio_alloc_bioset(gfp, 1, bs); + idx = 0; + goto out; + } + + bio_for_each_segment(bv, bio, idx) { + vcnt = idx - bio->bi_idx; + + if (!nbytes) { + ret = bio_alloc_bioset(gfp, vcnt, bs); + if (!ret) + return NULL; + + memcpy(ret->bi_io_vec, bio_iovec(bio), + sizeof(struct bio_vec) * vcnt); + + break; + } else if (nbytes < bv->bv_len) { + ret = bio_alloc_bioset(gfp, ++vcnt, bs); + if (!ret) + return NULL; + + memcpy(ret->bi_io_vec, bio_iovec(bio), + sizeof(struct bio_vec) * vcnt); + + ret->bi_io_vec[vcnt - 1].bv_len = nbytes; + bv->bv_offset += nbytes; + bv->bv_len -= nbytes; + break; + } + + nbytes -= bv->bv_len; + } +out: + ret->bi_bdev = bio->bi_bdev; + ret->bi_sector = bio->bi_sector; + ret->bi_size = sectors << 9; + ret->bi_rw = bio->bi_rw; + ret->bi_vcnt = vcnt; + ret->bi_max_vecs = vcnt; + + bio->bi_sector += sectors; + bio->bi_size -= sectors << 9; + bio->bi_idx = idx; + + if (bio_integrity(bio)) { + if (bio_integrity_clone(ret, bio, gfp)) { + bio_put(ret); + return NULL; + } + + bio_integrity_trim(ret, 0, bio_sectors(ret)); + bio_integrity_trim(bio, bio_sectors(ret), bio_sectors(bio)); + } + + return ret; +} + +static unsigned bch_bio_max_sectors(struct bio *bio) +{ + unsigned ret = bio_sectors(bio); + struct request_queue *q = bdev_get_queue(bio->bi_bdev); + unsigned max_segments = min_t(unsigned, BIO_MAX_PAGES, + queue_max_segments(q)); + struct bio_vec *bv, *end = bio_iovec(bio) + + min_t(int, bio_segments(bio), max_segments); + + if (bio->bi_rw & REQ_DISCARD) + return min(ret, q->limits.max_discard_sectors); + + if (bio_segments(bio) > max_segments || + q->merge_bvec_fn) { + ret = 0; + + for (bv = bio_iovec(bio); bv < end; bv++) { + struct bvec_merge_data bvm = { + .bi_bdev = bio->bi_bdev, + .bi_sector = bio->bi_sector, + .bi_size = ret << 9, + .bi_rw = bio->bi_rw, + }; + + if (q->merge_bvec_fn && + q->merge_bvec_fn(q, &bvm, bv) < (int) bv->bv_len) + break; + + ret += bv->bv_len >> 9; + } + } + + ret = min(ret, queue_max_sectors(q)); + + WARN_ON(!ret); + ret = max_t(int, ret, bio_iovec(bio)->bv_len >> 9); + + return ret; +} + +static void bch_bio_submit_split_done(struct closure *cl) +{ + struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl); + + s->bio->bi_end_io = s->bi_end_io; + s->bio->bi_private = s->bi_private; + bio_endio(s->bio, 0); + + closure_debug_destroy(&s->cl); + mempool_free(s, s->p->bio_split_hook); +} + +static void bch_bio_submit_split_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl); + + if (error) + clear_bit(BIO_UPTODATE, &s->bio->bi_flags); + + bio_put(bio); + closure_put(cl); +} + +static void __bch_bio_submit_split(struct closure *cl) +{ + struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl); + struct bio *bio = s->bio, *n; + + do { + n = bch_bio_split(bio, bch_bio_max_sectors(bio), + GFP_NOIO, s->p->bio_split); + if (!n) + continue_at(cl, __bch_bio_submit_split, system_wq); + + n->bi_end_io = bch_bio_submit_split_endio; + n->bi_private = cl; + + closure_get(cl); + bch_generic_make_request_hack(n); + } while (n != bio); + + continue_at(cl, bch_bio_submit_split_done, NULL); +} + +void bch_generic_make_request(struct bio *bio, struct bio_split_pool *p) +{ + struct bio_split_hook *s; + + if (!bio_has_data(bio) && !(bio->bi_rw & REQ_DISCARD)) + goto submit; + + if (bio_sectors(bio) <= bch_bio_max_sectors(bio)) + goto submit; + + s = mempool_alloc(p->bio_split_hook, GFP_NOIO); + + s->bio = bio; + s->p = p; + s->bi_end_io = bio->bi_end_io; + s->bi_private = bio->bi_private; + bio_get(bio); + + closure_call(&s->cl, __bch_bio_submit_split, NULL, NULL); + return; +submit: + bch_generic_make_request_hack(bio); +} + +/* Bios with headers */ + +void bch_bbio_free(struct bio *bio, struct cache_set *c) +{ + struct bbio *b = container_of(bio, struct bbio, bio); + mempool_free(b, c->bio_meta); +} + +struct bio *bch_bbio_alloc(struct cache_set *c) +{ + struct bbio *b = mempool_alloc(c->bio_meta, GFP_NOIO); + struct bio *bio = &b->bio; + + bio_init(bio); + bio->bi_flags |= BIO_POOL_NONE << BIO_POOL_OFFSET; + bio->bi_max_vecs = bucket_pages(c); + bio->bi_io_vec = bio->bi_inline_vecs; + + return bio; +} + +void __bch_submit_bbio(struct bio *bio, struct cache_set *c) +{ + struct bbio *b = container_of(bio, struct bbio, bio); + + bio->bi_sector = PTR_OFFSET(&b->key, 0); + bio->bi_bdev = PTR_CACHE(c, &b->key, 0)->bdev; + + b->submit_time_us = local_clock_us(); + closure_bio_submit(bio, bio->bi_private, PTR_CACHE(c, &b->key, 0)); +} + +void bch_submit_bbio(struct bio *bio, struct cache_set *c, + struct bkey *k, unsigned ptr) +{ + struct bbio *b = container_of(bio, struct bbio, bio); + bch_bkey_copy_single_ptr(&b->key, k, ptr); + __bch_submit_bbio(bio, c); +} + +/* IO errors */ + +void bch_count_io_errors(struct cache *ca, int error, const char *m) +{ + /* + * The halflife of an error is: + * log2(1/2)/log2(127/128) * refresh ~= 88 * refresh + */ + + if (ca->set->error_decay) { + unsigned count = atomic_inc_return(&ca->io_count); + + while (count > ca->set->error_decay) { + unsigned errors; + unsigned old = count; + unsigned new = count - ca->set->error_decay; + + /* + * First we subtract refresh from count; each time we + * succesfully do so, we rescale the errors once: + */ + + count = atomic_cmpxchg(&ca->io_count, old, new); + + if (count == old) { + count = new; + + errors = atomic_read(&ca->io_errors); + do { + old = errors; + new = ((uint64_t) errors * 127) / 128; + errors = atomic_cmpxchg(&ca->io_errors, + old, new); + } while (old != errors); + } + } + } + + if (error) { + char buf[BDEVNAME_SIZE]; + unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT, + &ca->io_errors); + errors >>= IO_ERROR_SHIFT; + + if (errors < ca->set->error_limit) + pr_err("%s: IO error on %s, recovering", + bdevname(ca->bdev, buf), m); + else + bch_cache_set_error(ca->set, + "%s: too many IO errors %s", + bdevname(ca->bdev, buf), m); + } +} + +void bch_bbio_count_io_errors(struct cache_set *c, struct bio *bio, + int error, const char *m) +{ + struct bbio *b = container_of(bio, struct bbio, bio); + struct cache *ca = PTR_CACHE(c, &b->key, 0); + + unsigned threshold = bio->bi_rw & REQ_WRITE + ? c->congested_write_threshold_us + : c->congested_read_threshold_us; + + if (threshold) { + unsigned t = local_clock_us(); + + int us = t - b->submit_time_us; + int congested = atomic_read(&c->congested); + + if (us > (int) threshold) { + int ms = us / 1024; + c->congested_last_us = t; + + ms = min(ms, CONGESTED_MAX + congested); + atomic_sub(ms, &c->congested); + } else if (congested < 0) + atomic_inc(&c->congested); + } + + bch_count_io_errors(ca, error, m); +} + +void bch_bbio_endio(struct cache_set *c, struct bio *bio, + int error, const char *m) +{ + struct closure *cl = bio->bi_private; + + bch_bbio_count_io_errors(c, bio, error, m); + bio_put(bio); + closure_put(cl); +} diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c new file mode 100644 index 000000000000..8c8dfdcd9d4c --- /dev/null +++ b/drivers/md/bcache/journal.c @@ -0,0 +1,787 @@ +/* + * bcache journalling code, for btree insertions + * + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +/* + * Journal replay/recovery: + * + * This code is all driven from run_cache_set(); we first read the journal + * entries, do some other stuff, then we mark all the keys in the journal + * entries (same as garbage collection would), then we replay them - reinserting + * them into the cache in precisely the same order as they appear in the + * journal. + * + * We only journal keys that go in leaf nodes, which simplifies things quite a + * bit. + */ + +static void journal_read_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + closure_put(cl); +} + +static int journal_read_bucket(struct cache *ca, struct list_head *list, + struct btree_op *op, unsigned bucket_index) +{ + struct journal_device *ja = &ca->journal; + struct bio *bio = &ja->bio; + + struct journal_replay *i; + struct jset *j, *data = ca->set->journal.w[0].data; + unsigned len, left, offset = 0; + int ret = 0; + sector_t bucket = bucket_to_sector(ca->set, ca->sb.d[bucket_index]); + + pr_debug("reading %llu", (uint64_t) bucket); + + while (offset < ca->sb.bucket_size) { +reread: left = ca->sb.bucket_size - offset; + len = min_t(unsigned, left, PAGE_SECTORS * 8); + + bio_reset(bio); + bio->bi_sector = bucket + offset; + bio->bi_bdev = ca->bdev; + bio->bi_rw = READ; + bio->bi_size = len << 9; + + bio->bi_end_io = journal_read_endio; + bio->bi_private = &op->cl; + bch_bio_map(bio, data); + + closure_bio_submit(bio, &op->cl, ca); + closure_sync(&op->cl); + + /* This function could be simpler now since we no longer write + * journal entries that overlap bucket boundaries; this means + * the start of a bucket will always have a valid journal entry + * if it has any journal entries at all. + */ + + j = data; + while (len) { + struct list_head *where; + size_t blocks, bytes = set_bytes(j); + + if (j->magic != jset_magic(ca->set)) + return ret; + + if (bytes > left << 9) + return ret; + + if (bytes > len << 9) + goto reread; + + if (j->csum != csum_set(j)) + return ret; + + blocks = set_blocks(j, ca->set); + + while (!list_empty(list)) { + i = list_first_entry(list, + struct journal_replay, list); + if (i->j.seq >= j->last_seq) + break; + list_del(&i->list); + kfree(i); + } + + list_for_each_entry_reverse(i, list, list) { + if (j->seq == i->j.seq) + goto next_set; + + if (j->seq < i->j.last_seq) + goto next_set; + + if (j->seq > i->j.seq) { + where = &i->list; + goto add; + } + } + + where = list; +add: + i = kmalloc(offsetof(struct journal_replay, j) + + bytes, GFP_KERNEL); + if (!i) + return -ENOMEM; + memcpy(&i->j, j, bytes); + list_add(&i->list, where); + ret = 1; + + ja->seq[bucket_index] = j->seq; +next_set: + offset += blocks * ca->sb.block_size; + len -= blocks * ca->sb.block_size; + j = ((void *) j) + blocks * block_bytes(ca); + } + } + + return ret; +} + +int bch_journal_read(struct cache_set *c, struct list_head *list, + struct btree_op *op) +{ +#define read_bucket(b) \ + ({ \ + int ret = journal_read_bucket(ca, list, op, b); \ + __set_bit(b, bitmap); \ + if (ret < 0) \ + return ret; \ + ret; \ + }) + + struct cache *ca; + unsigned iter; + + for_each_cache(ca, c, iter) { + struct journal_device *ja = &ca->journal; + unsigned long bitmap[SB_JOURNAL_BUCKETS / BITS_PER_LONG]; + unsigned i, l, r, m; + uint64_t seq; + + bitmap_zero(bitmap, SB_JOURNAL_BUCKETS); + pr_debug("%u journal buckets", ca->sb.njournal_buckets); + + /* Read journal buckets ordered by golden ratio hash to quickly + * find a sequence of buckets with valid journal entries + */ + for (i = 0; i < ca->sb.njournal_buckets; i++) { + l = (i * 2654435769U) % ca->sb.njournal_buckets; + + if (test_bit(l, bitmap)) + break; + + if (read_bucket(l)) + goto bsearch; + } + + /* If that fails, check all the buckets we haven't checked + * already + */ + pr_debug("falling back to linear search"); + + for (l = 0; l < ca->sb.njournal_buckets; l++) { + if (test_bit(l, bitmap)) + continue; + + if (read_bucket(l)) + goto bsearch; + } +bsearch: + /* Binary search */ + m = r = find_next_bit(bitmap, ca->sb.njournal_buckets, l + 1); + pr_debug("starting binary search, l %u r %u", l, r); + + while (l + 1 < r) { + m = (l + r) >> 1; + + if (read_bucket(m)) + l = m; + else + r = m; + } + + /* Read buckets in reverse order until we stop finding more + * journal entries + */ + pr_debug("finishing up"); + l = m; + + while (1) { + if (!l--) + l = ca->sb.njournal_buckets - 1; + + if (l == m) + break; + + if (test_bit(l, bitmap)) + continue; + + if (!read_bucket(l)) + break; + } + + seq = 0; + + for (i = 0; i < ca->sb.njournal_buckets; i++) + if (ja->seq[i] > seq) { + seq = ja->seq[i]; + ja->cur_idx = ja->discard_idx = + ja->last_idx = i; + + } + } + + c->journal.seq = list_entry(list->prev, + struct journal_replay, + list)->j.seq; + + return 0; +#undef read_bucket +} + +void bch_journal_mark(struct cache_set *c, struct list_head *list) +{ + atomic_t p = { 0 }; + struct bkey *k; + struct journal_replay *i; + struct journal *j = &c->journal; + uint64_t last = j->seq; + + /* + * journal.pin should never fill up - we never write a journal + * entry when it would fill up. But if for some reason it does, we + * iterate over the list in reverse order so that we can just skip that + * refcount instead of bugging. + */ + + list_for_each_entry_reverse(i, list, list) { + BUG_ON(last < i->j.seq); + i->pin = NULL; + + while (last-- != i->j.seq) + if (fifo_free(&j->pin) > 1) { + fifo_push_front(&j->pin, p); + atomic_set(&fifo_front(&j->pin), 0); + } + + if (fifo_free(&j->pin) > 1) { + fifo_push_front(&j->pin, p); + i->pin = &fifo_front(&j->pin); + atomic_set(i->pin, 1); + } + + for (k = i->j.start; + k < end(&i->j); + k = bkey_next(k)) { + unsigned j; + + for (j = 0; j < KEY_PTRS(k); j++) { + struct bucket *g = PTR_BUCKET(c, k, j); + atomic_inc(&g->pin); + + if (g->prio == BTREE_PRIO && + !ptr_stale(c, k, j)) + g->prio = INITIAL_PRIO; + } + + __bch_btree_mark_key(c, 0, k); + } + } +} + +int bch_journal_replay(struct cache_set *s, struct list_head *list, + struct btree_op *op) +{ + int ret = 0, keys = 0, entries = 0; + struct bkey *k; + struct journal_replay *i = + list_entry(list->prev, struct journal_replay, list); + + uint64_t start = i->j.last_seq, end = i->j.seq, n = start; + + list_for_each_entry(i, list, list) { + BUG_ON(i->pin && atomic_read(i->pin) != 1); + + if (n != i->j.seq) + pr_err( + "journal entries %llu-%llu missing! (replaying %llu-%llu)\n", + n, i->j.seq - 1, start, end); + + for (k = i->j.start; + k < end(&i->j); + k = bkey_next(k)) { + pr_debug("%s", pkey(k)); + bkey_copy(op->keys.top, k); + bch_keylist_push(&op->keys); + + op->journal = i->pin; + atomic_inc(op->journal); + + ret = bch_btree_insert(op, s); + if (ret) + goto err; + + BUG_ON(!bch_keylist_empty(&op->keys)); + keys++; + + cond_resched(); + } + + if (i->pin) + atomic_dec(i->pin); + n = i->j.seq + 1; + entries++; + } + + pr_info("journal replay done, %i keys in %i entries, seq %llu", + keys, entries, end); + + while (!list_empty(list)) { + i = list_first_entry(list, struct journal_replay, list); + list_del(&i->list); + kfree(i); + } +err: + closure_sync(&op->cl); + return ret; +} + +/* Journalling */ + +static void btree_flush_write(struct cache_set *c) +{ + /* + * Try to find the btree node with that references the oldest journal + * entry, best is our current candidate and is locked if non NULL: + */ + struct btree *b, *best = NULL; + unsigned iter; + + for_each_cached_btree(b, c, iter) { + if (!down_write_trylock(&b->lock)) + continue; + + if (!btree_node_dirty(b) || + !btree_current_write(b)->journal) { + rw_unlock(true, b); + continue; + } + + if (!best) + best = b; + else if (journal_pin_cmp(c, + btree_current_write(best), + btree_current_write(b))) { + rw_unlock(true, best); + best = b; + } else + rw_unlock(true, b); + } + + if (best) + goto out; + + /* We can't find the best btree node, just pick the first */ + list_for_each_entry(b, &c->btree_cache, list) + if (!b->level && btree_node_dirty(b)) { + best = b; + rw_lock(true, best, best->level); + goto found; + } + +out: + if (!best) + return; +found: + if (btree_node_dirty(best)) + bch_btree_write(best, true, NULL); + rw_unlock(true, best); +} + +#define last_seq(j) ((j)->seq - fifo_used(&(j)->pin) + 1) + +static void journal_discard_endio(struct bio *bio, int error) +{ + struct journal_device *ja = + container_of(bio, struct journal_device, discard_bio); + struct cache *ca = container_of(ja, struct cache, journal); + + atomic_set(&ja->discard_in_flight, DISCARD_DONE); + + closure_wake_up(&ca->set->journal.wait); + closure_put(&ca->set->cl); +} + +static void journal_discard_work(struct work_struct *work) +{ + struct journal_device *ja = + container_of(work, struct journal_device, discard_work); + + submit_bio(0, &ja->discard_bio); +} + +static void do_journal_discard(struct cache *ca) +{ + struct journal_device *ja = &ca->journal; + struct bio *bio = &ja->discard_bio; + + if (!ca->discard) { + ja->discard_idx = ja->last_idx; + return; + } + + switch (atomic_read(&ja->discard_in_flight) == DISCARD_IN_FLIGHT) { + case DISCARD_IN_FLIGHT: + return; + + case DISCARD_DONE: + ja->discard_idx = (ja->discard_idx + 1) % + ca->sb.njournal_buckets; + + atomic_set(&ja->discard_in_flight, DISCARD_READY); + /* fallthrough */ + + case DISCARD_READY: + if (ja->discard_idx == ja->last_idx) + return; + + atomic_set(&ja->discard_in_flight, DISCARD_IN_FLIGHT); + + bio_init(bio); + bio->bi_sector = bucket_to_sector(ca->set, + ca->sb.d[ja->discard_idx]); + bio->bi_bdev = ca->bdev; + bio->bi_rw = REQ_WRITE|REQ_DISCARD; + bio->bi_max_vecs = 1; + bio->bi_io_vec = bio->bi_inline_vecs; + bio->bi_size = bucket_bytes(ca); + bio->bi_end_io = journal_discard_endio; + + closure_get(&ca->set->cl); + INIT_WORK(&ja->discard_work, journal_discard_work); + schedule_work(&ja->discard_work); + } +} + +static void journal_reclaim(struct cache_set *c) +{ + struct bkey *k = &c->journal.key; + struct cache *ca; + uint64_t last_seq; + unsigned iter, n = 0; + atomic_t p; + + while (!atomic_read(&fifo_front(&c->journal.pin))) + fifo_pop(&c->journal.pin, p); + + last_seq = last_seq(&c->journal); + + /* Update last_idx */ + + for_each_cache(ca, c, iter) { + struct journal_device *ja = &ca->journal; + + while (ja->last_idx != ja->cur_idx && + ja->seq[ja->last_idx] < last_seq) + ja->last_idx = (ja->last_idx + 1) % + ca->sb.njournal_buckets; + } + + for_each_cache(ca, c, iter) + do_journal_discard(ca); + + if (c->journal.blocks_free) + return; + + /* + * Allocate: + * XXX: Sort by free journal space + */ + + for_each_cache(ca, c, iter) { + struct journal_device *ja = &ca->journal; + unsigned next = (ja->cur_idx + 1) % ca->sb.njournal_buckets; + + /* No space available on this device */ + if (next == ja->discard_idx) + continue; + + ja->cur_idx = next; + k->ptr[n++] = PTR(0, + bucket_to_sector(c, ca->sb.d[ja->cur_idx]), + ca->sb.nr_this_dev); + } + + bkey_init(k); + SET_KEY_PTRS(k, n); + + if (n) + c->journal.blocks_free = c->sb.bucket_size >> c->block_bits; + + if (!journal_full(&c->journal)) + __closure_wake_up(&c->journal.wait); +} + +void bch_journal_next(struct journal *j) +{ + atomic_t p = { 1 }; + + j->cur = (j->cur == j->w) + ? &j->w[1] + : &j->w[0]; + + /* + * The fifo_push() needs to happen at the same time as j->seq is + * incremented for last_seq() to be calculated correctly + */ + BUG_ON(!fifo_push(&j->pin, p)); + atomic_set(&fifo_back(&j->pin), 1); + + j->cur->data->seq = ++j->seq; + j->cur->need_write = false; + j->cur->data->keys = 0; + + if (fifo_full(&j->pin)) + pr_debug("journal_pin full (%zu)", fifo_used(&j->pin)); +} + +static void journal_write_endio(struct bio *bio, int error) +{ + struct journal_write *w = bio->bi_private; + + cache_set_err_on(error, w->c, "journal io error"); + closure_put(&w->c->journal.io.cl); +} + +static void journal_write(struct closure *); + +static void journal_write_done(struct closure *cl) +{ + struct journal *j = container_of(cl, struct journal, io.cl); + struct cache_set *c = container_of(j, struct cache_set, journal); + + struct journal_write *w = (j->cur == j->w) + ? &j->w[1] + : &j->w[0]; + + __closure_wake_up(&w->wait); + + if (c->journal_delay_ms) + closure_delay(&j->io, msecs_to_jiffies(c->journal_delay_ms)); + + continue_at(cl, journal_write, system_wq); +} + +static void journal_write_unlocked(struct closure *cl) + __releases(c->journal.lock) +{ + struct cache_set *c = container_of(cl, struct cache_set, journal.io.cl); + struct cache *ca; + struct journal_write *w = c->journal.cur; + struct bkey *k = &c->journal.key; + unsigned i, sectors = set_blocks(w->data, c) * c->sb.block_size; + + struct bio *bio; + struct bio_list list; + bio_list_init(&list); + + if (!w->need_write) { + /* + * XXX: have to unlock closure before we unlock journal lock, + * else we race with bch_journal(). But this way we race + * against cache set unregister. Doh. + */ + set_closure_fn(cl, NULL, NULL); + closure_sub(cl, CLOSURE_RUNNING + 1); + spin_unlock(&c->journal.lock); + return; + } else if (journal_full(&c->journal)) { + journal_reclaim(c); + spin_unlock(&c->journal.lock); + + btree_flush_write(c); + continue_at(cl, journal_write, system_wq); + } + + c->journal.blocks_free -= set_blocks(w->data, c); + + w->data->btree_level = c->root->level; + + bkey_copy(&w->data->btree_root, &c->root->key); + bkey_copy(&w->data->uuid_bucket, &c->uuid_bucket); + + for_each_cache(ca, c, i) + w->data->prio_bucket[ca->sb.nr_this_dev] = ca->prio_buckets[0]; + + w->data->magic = jset_magic(c); + w->data->version = BCACHE_JSET_VERSION; + w->data->last_seq = last_seq(&c->journal); + w->data->csum = csum_set(w->data); + + for (i = 0; i < KEY_PTRS(k); i++) { + ca = PTR_CACHE(c, k, i); + bio = &ca->journal.bio; + + atomic_long_add(sectors, &ca->meta_sectors_written); + + bio_reset(bio); + bio->bi_sector = PTR_OFFSET(k, i); + bio->bi_bdev = ca->bdev; + bio->bi_rw = REQ_WRITE|REQ_SYNC|REQ_META|REQ_FLUSH; + bio->bi_size = sectors << 9; + + bio->bi_end_io = journal_write_endio; + bio->bi_private = w; + bch_bio_map(bio, w->data); + + trace_bcache_journal_write(bio); + bio_list_add(&list, bio); + + SET_PTR_OFFSET(k, i, PTR_OFFSET(k, i) + sectors); + + ca->journal.seq[ca->journal.cur_idx] = w->data->seq; + } + + atomic_dec_bug(&fifo_back(&c->journal.pin)); + bch_journal_next(&c->journal); + journal_reclaim(c); + + spin_unlock(&c->journal.lock); + + while ((bio = bio_list_pop(&list))) + closure_bio_submit(bio, cl, c->cache[0]); + + continue_at(cl, journal_write_done, NULL); +} + +static void journal_write(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, journal.io.cl); + + spin_lock(&c->journal.lock); + journal_write_unlocked(cl); +} + +static void __journal_try_write(struct cache_set *c, bool noflush) + __releases(c->journal.lock) +{ + struct closure *cl = &c->journal.io.cl; + + if (!closure_trylock(cl, &c->cl)) + spin_unlock(&c->journal.lock); + else if (noflush && journal_full(&c->journal)) { + spin_unlock(&c->journal.lock); + continue_at(cl, journal_write, system_wq); + } else + journal_write_unlocked(cl); +} + +#define journal_try_write(c) __journal_try_write(c, false) + +void bch_journal_meta(struct cache_set *c, struct closure *cl) +{ + struct journal_write *w; + + if (CACHE_SYNC(&c->sb)) { + spin_lock(&c->journal.lock); + + w = c->journal.cur; + w->need_write = true; + + if (cl) + BUG_ON(!closure_wait(&w->wait, cl)); + + __journal_try_write(c, true); + } +} + +/* + * Entry point to the journalling code - bio_insert() and btree_invalidate() + * pass bch_journal() a list of keys to be journalled, and then + * bch_journal() hands those same keys off to btree_insert_async() + */ + +void bch_journal(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + struct cache_set *c = op->c; + struct journal_write *w; + size_t b, n = ((uint64_t *) op->keys.top) - op->keys.list; + + if (op->type != BTREE_INSERT || + !CACHE_SYNC(&c->sb)) + goto out; + + /* + * If we're looping because we errored, might already be waiting on + * another journal write: + */ + while (atomic_read(&cl->parent->remaining) & CLOSURE_WAITING) + closure_sync(cl->parent); + + spin_lock(&c->journal.lock); + + if (journal_full(&c->journal)) { + /* XXX: tracepoint */ + closure_wait(&c->journal.wait, cl); + + journal_reclaim(c); + spin_unlock(&c->journal.lock); + + btree_flush_write(c); + continue_at(cl, bch_journal, bcache_wq); + } + + w = c->journal.cur; + w->need_write = true; + b = __set_blocks(w->data, w->data->keys + n, c); + + if (b * c->sb.block_size > PAGE_SECTORS << JSET_BITS || + b > c->journal.blocks_free) { + /* XXX: If we were inserting so many keys that they won't fit in + * an _empty_ journal write, we'll deadlock. For now, handle + * this in bch_keylist_realloc() - but something to think about. + */ + BUG_ON(!w->data->keys); + + /* XXX: tracepoint */ + BUG_ON(!closure_wait(&w->wait, cl)); + + closure_flush(&c->journal.io); + + journal_try_write(c); + continue_at(cl, bch_journal, bcache_wq); + } + + memcpy(end(w->data), op->keys.list, n * sizeof(uint64_t)); + w->data->keys += n; + + op->journal = &fifo_back(&c->journal.pin); + atomic_inc(op->journal); + + if (op->flush_journal) { + closure_flush(&c->journal.io); + closure_wait(&w->wait, cl->parent); + } + + journal_try_write(c); +out: + bch_btree_insert_async(cl); +} + +void bch_journal_free(struct cache_set *c) +{ + free_pages((unsigned long) c->journal.w[1].data, JSET_BITS); + free_pages((unsigned long) c->journal.w[0].data, JSET_BITS); + free_fifo(&c->journal.pin); +} + +int bch_journal_alloc(struct cache_set *c) +{ + struct journal *j = &c->journal; + + closure_init_unlocked(&j->io); + spin_lock_init(&j->lock); + + c->journal_delay_ms = 100; + + j->w[0].c = c; + j->w[1].c = c; + + if (!(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) || + !(j->w[0].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS)) || + !(j->w[1].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS))) + return -ENOMEM; + + return 0; +} diff --git a/drivers/md/bcache/journal.h b/drivers/md/bcache/journal.h new file mode 100644 index 000000000000..3d7851274b04 --- /dev/null +++ b/drivers/md/bcache/journal.h @@ -0,0 +1,215 @@ +#ifndef _BCACHE_JOURNAL_H +#define _BCACHE_JOURNAL_H + +/* + * THE JOURNAL: + * + * The journal is treated as a circular buffer of buckets - a journal entry + * never spans two buckets. This means (not implemented yet) we can resize the + * journal at runtime, and will be needed for bcache on raw flash support. + * + * Journal entries contain a list of keys, ordered by the time they were + * inserted; thus journal replay just has to reinsert the keys. + * + * We also keep some things in the journal header that are logically part of the + * superblock - all the things that are frequently updated. This is for future + * bcache on raw flash support; the superblock (which will become another + * journal) can't be moved or wear leveled, so it contains just enough + * information to find the main journal, and the superblock only has to be + * rewritten when we want to move/wear level the main journal. + * + * Currently, we don't journal BTREE_REPLACE operations - this will hopefully be + * fixed eventually. This isn't a bug - BTREE_REPLACE is used for insertions + * from cache misses, which don't have to be journaled, and for writeback and + * moving gc we work around it by flushing the btree to disk before updating the + * gc information. But it is a potential issue with incremental garbage + * collection, and it's fragile. + * + * OPEN JOURNAL ENTRIES: + * + * Each journal entry contains, in the header, the sequence number of the last + * journal entry still open - i.e. that has keys that haven't been flushed to + * disk in the btree. + * + * We track this by maintaining a refcount for every open journal entry, in a + * fifo; each entry in the fifo corresponds to a particular journal + * entry/sequence number. When the refcount at the tail of the fifo goes to + * zero, we pop it off - thus, the size of the fifo tells us the number of open + * journal entries + * + * We take a refcount on a journal entry when we add some keys to a journal + * entry that we're going to insert (held by struct btree_op), and then when we + * insert those keys into the btree the btree write we're setting up takes a + * copy of that refcount (held by struct btree_write). That refcount is dropped + * when the btree write completes. + * + * A struct btree_write can only hold a refcount on a single journal entry, but + * might contain keys for many journal entries - we handle this by making sure + * it always has a refcount on the _oldest_ journal entry of all the journal + * entries it has keys for. + * + * JOURNAL RECLAIM: + * + * As mentioned previously, our fifo of refcounts tells us the number of open + * journal entries; from that and the current journal sequence number we compute + * last_seq - the oldest journal entry we still need. We write last_seq in each + * journal entry, and we also have to keep track of where it exists on disk so + * we don't overwrite it when we loop around the journal. + * + * To do that we track, for each journal bucket, the sequence number of the + * newest journal entry it contains - if we don't need that journal entry we + * don't need anything in that bucket anymore. From that we track the last + * journal bucket we still need; all this is tracked in struct journal_device + * and updated by journal_reclaim(). + * + * JOURNAL FILLING UP: + * + * There are two ways the journal could fill up; either we could run out of + * space to write to, or we could have too many open journal entries and run out + * of room in the fifo of refcounts. Since those refcounts are decremented + * without any locking we can't safely resize that fifo, so we handle it the + * same way. + * + * If the journal fills up, we start flushing dirty btree nodes until we can + * allocate space for a journal write again - preferentially flushing btree + * nodes that are pinning the oldest journal entries first. + */ + +#define BCACHE_JSET_VERSION_UUIDv1 1 +/* Always latest UUID format */ +#define BCACHE_JSET_VERSION_UUID 1 +#define BCACHE_JSET_VERSION 1 + +/* + * On disk format for a journal entry: + * seq is monotonically increasing; every journal entry has its own unique + * sequence number. + * + * last_seq is the oldest journal entry that still has keys the btree hasn't + * flushed to disk yet. + * + * version is for on disk format changes. + */ +struct jset { + uint64_t csum; + uint64_t magic; + uint64_t seq; + uint32_t version; + uint32_t keys; + + uint64_t last_seq; + + BKEY_PADDED(uuid_bucket); + BKEY_PADDED(btree_root); + uint16_t btree_level; + uint16_t pad[3]; + + uint64_t prio_bucket[MAX_CACHES_PER_SET]; + + union { + struct bkey start[0]; + uint64_t d[0]; + }; +}; + +/* + * Only used for holding the journal entries we read in btree_journal_read() + * during cache_registration + */ +struct journal_replay { + struct list_head list; + atomic_t *pin; + struct jset j; +}; + +/* + * We put two of these in struct journal; we used them for writes to the + * journal that are being staged or in flight. + */ +struct journal_write { + struct jset *data; +#define JSET_BITS 3 + + struct cache_set *c; + struct closure_waitlist wait; + bool need_write; +}; + +/* Embedded in struct cache_set */ +struct journal { + spinlock_t lock; + /* used when waiting because the journal was full */ + struct closure_waitlist wait; + struct closure_with_timer io; + + /* Number of blocks free in the bucket(s) we're currently writing to */ + unsigned blocks_free; + uint64_t seq; + DECLARE_FIFO(atomic_t, pin); + + BKEY_PADDED(key); + + struct journal_write w[2], *cur; +}; + +/* + * Embedded in struct cache. First three fields refer to the array of journal + * buckets, in cache_sb. + */ +struct journal_device { + /* + * For each journal bucket, contains the max sequence number of the + * journal writes it contains - so we know when a bucket can be reused. + */ + uint64_t seq[SB_JOURNAL_BUCKETS]; + + /* Journal bucket we're currently writing to */ + unsigned cur_idx; + + /* Last journal bucket that still contains an open journal entry */ + unsigned last_idx; + + /* Next journal bucket to be discarded */ + unsigned discard_idx; + +#define DISCARD_READY 0 +#define DISCARD_IN_FLIGHT 1 +#define DISCARD_DONE 2 + /* 1 - discard in flight, -1 - discard completed */ + atomic_t discard_in_flight; + + struct work_struct discard_work; + struct bio discard_bio; + struct bio_vec discard_bv; + + /* Bio for journal reads/writes to this device */ + struct bio bio; + struct bio_vec bv[8]; +}; + +#define journal_pin_cmp(c, l, r) \ + (fifo_idx(&(c)->journal.pin, (l)->journal) > \ + fifo_idx(&(c)->journal.pin, (r)->journal)) + +#define JOURNAL_PIN 20000 + +#define journal_full(j) \ + (!(j)->blocks_free || fifo_free(&(j)->pin) <= 1) + +struct closure; +struct cache_set; +struct btree_op; + +void bch_journal(struct closure *); +void bch_journal_next(struct journal *); +void bch_journal_mark(struct cache_set *, struct list_head *); +void bch_journal_meta(struct cache_set *, struct closure *); +int bch_journal_read(struct cache_set *, struct list_head *, + struct btree_op *); +int bch_journal_replay(struct cache_set *, struct list_head *, + struct btree_op *); + +void bch_journal_free(struct cache_set *); +int bch_journal_alloc(struct cache_set *); + +#endif /* _BCACHE_JOURNAL_H */ diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c new file mode 100644 index 000000000000..8589512c972e --- /dev/null +++ b/drivers/md/bcache/movinggc.c @@ -0,0 +1,254 @@ +/* + * Moving/copying garbage collector + * + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +struct moving_io { + struct keybuf_key *w; + struct search s; + struct bbio bio; +}; + +static bool moving_pred(struct keybuf *buf, struct bkey *k) +{ + struct cache_set *c = container_of(buf, struct cache_set, + moving_gc_keys); + unsigned i; + + for (i = 0; i < KEY_PTRS(k); i++) { + struct cache *ca = PTR_CACHE(c, k, i); + struct bucket *g = PTR_BUCKET(c, k, i); + + if (GC_SECTORS_USED(g) < ca->gc_move_threshold) + return true; + } + + return false; +} + +/* Moving GC - IO loop */ + +static void moving_io_destructor(struct closure *cl) +{ + struct moving_io *io = container_of(cl, struct moving_io, s.cl); + kfree(io); +} + +static void write_moving_finish(struct closure *cl) +{ + struct moving_io *io = container_of(cl, struct moving_io, s.cl); + struct bio *bio = &io->bio.bio; + struct bio_vec *bv = bio_iovec_idx(bio, bio->bi_vcnt); + + while (bv-- != bio->bi_io_vec) + __free_page(bv->bv_page); + + pr_debug("%s %s", io->s.op.insert_collision + ? "collision moving" : "moved", + pkey(&io->w->key)); + + bch_keybuf_del(&io->s.op.c->moving_gc_keys, io->w); + + atomic_dec_bug(&io->s.op.c->in_flight); + closure_wake_up(&io->s.op.c->moving_gc_wait); + + closure_return_with_destructor(cl, moving_io_destructor); +} + +static void read_moving_endio(struct bio *bio, int error) +{ + struct moving_io *io = container_of(bio->bi_private, + struct moving_io, s.cl); + + if (error) + io->s.error = error; + + bch_bbio_endio(io->s.op.c, bio, error, "reading data to move"); +} + +static void moving_init(struct moving_io *io) +{ + struct bio *bio = &io->bio.bio; + + bio_init(bio); + bio_get(bio); + bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)); + + bio->bi_size = KEY_SIZE(&io->w->key) << 9; + bio->bi_max_vecs = DIV_ROUND_UP(KEY_SIZE(&io->w->key), + PAGE_SECTORS); + bio->bi_private = &io->s.cl; + bio->bi_io_vec = bio->bi_inline_vecs; + bch_bio_map(bio, NULL); +} + +static void write_moving(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct moving_io *io = container_of(s, struct moving_io, s); + + if (!s->error) { + trace_bcache_write_moving(&io->bio.bio); + + moving_init(io); + + io->bio.bio.bi_sector = KEY_START(&io->w->key); + s->op.lock = -1; + s->op.write_prio = 1; + s->op.cache_bio = &io->bio.bio; + + s->writeback = KEY_DIRTY(&io->w->key); + s->op.csum = KEY_CSUM(&io->w->key); + + s->op.type = BTREE_REPLACE; + bkey_copy(&s->op.replace, &io->w->key); + + closure_init(&s->op.cl, cl); + bch_insert_data(&s->op.cl); + } + + continue_at(cl, write_moving_finish, NULL); +} + +static void read_moving_submit(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct moving_io *io = container_of(s, struct moving_io, s); + struct bio *bio = &io->bio.bio; + + trace_bcache_read_moving(bio); + bch_submit_bbio(bio, s->op.c, &io->w->key, 0); + + continue_at(cl, write_moving, bch_gc_wq); +} + +static void read_moving(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, moving_gc); + struct keybuf_key *w; + struct moving_io *io; + struct bio *bio; + + /* XXX: if we error, background writeback could stall indefinitely */ + + while (!test_bit(CACHE_SET_STOPPING, &c->flags)) { + w = bch_keybuf_next_rescan(c, &c->moving_gc_keys, &MAX_KEY); + if (!w) + break; + + io = kzalloc(sizeof(struct moving_io) + sizeof(struct bio_vec) + * DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS), + GFP_KERNEL); + if (!io) + goto err; + + w->private = io; + io->w = w; + io->s.op.inode = KEY_INODE(&w->key); + io->s.op.c = c; + + moving_init(io); + bio = &io->bio.bio; + + bio->bi_rw = READ; + bio->bi_end_io = read_moving_endio; + + if (bch_bio_alloc_pages(bio, GFP_KERNEL)) + goto err; + + pr_debug("%s", pkey(&w->key)); + + closure_call(&io->s.cl, read_moving_submit, NULL, &c->gc.cl); + + if (atomic_inc_return(&c->in_flight) >= 64) { + closure_wait_event(&c->moving_gc_wait, cl, + atomic_read(&c->in_flight) < 64); + continue_at(cl, read_moving, bch_gc_wq); + } + } + + if (0) { +err: if (!IS_ERR_OR_NULL(w->private)) + kfree(w->private); + + bch_keybuf_del(&c->moving_gc_keys, w); + } + + closure_return(cl); +} + +static bool bucket_cmp(struct bucket *l, struct bucket *r) +{ + return GC_SECTORS_USED(l) < GC_SECTORS_USED(r); +} + +static unsigned bucket_heap_top(struct cache *ca) +{ + return GC_SECTORS_USED(heap_peek(&ca->heap)); +} + +void bch_moving_gc(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, gc.cl); + struct cache *ca; + struct bucket *b; + unsigned i; + + if (!c->copy_gc_enabled) + closure_return(cl); + + mutex_lock(&c->bucket_lock); + + for_each_cache(ca, c, i) { + unsigned sectors_to_move = 0; + unsigned reserve_sectors = ca->sb.bucket_size * + min(fifo_used(&ca->free), ca->free.size / 2); + + ca->heap.used = 0; + + for_each_bucket(b, ca) { + if (!GC_SECTORS_USED(b)) + continue; + + if (!heap_full(&ca->heap)) { + sectors_to_move += GC_SECTORS_USED(b); + heap_add(&ca->heap, b, bucket_cmp); + } else if (bucket_cmp(b, heap_peek(&ca->heap))) { + sectors_to_move -= bucket_heap_top(ca); + sectors_to_move += GC_SECTORS_USED(b); + + ca->heap.data[0] = b; + heap_sift(&ca->heap, 0, bucket_cmp); + } + } + + while (sectors_to_move > reserve_sectors) { + heap_pop(&ca->heap, b, bucket_cmp); + sectors_to_move -= GC_SECTORS_USED(b); + } + + ca->gc_move_threshold = bucket_heap_top(ca); + + pr_debug("threshold %u", ca->gc_move_threshold); + } + + mutex_unlock(&c->bucket_lock); + + c->moving_gc_keys.last_scanned = ZERO_KEY; + + closure_init(&c->moving_gc, cl); + read_moving(&c->moving_gc); + + closure_return(cl); +} + +void bch_moving_init_cache_set(struct cache_set *c) +{ + bch_keybuf_init(&c->moving_gc_keys, moving_pred); +} diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c new file mode 100644 index 000000000000..e5ff12e52d5b --- /dev/null +++ b/drivers/md/bcache/request.c @@ -0,0 +1,1411 @@ +/* + * Main bcache entry point - handle a read or a write request and decide what to + * do with it; the make_request functions are called by the block layer. + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +#include +#include +#include +#include +#include "blk-cgroup.h" + +#include + +#define CUTOFF_CACHE_ADD 95 +#define CUTOFF_CACHE_READA 90 +#define CUTOFF_WRITEBACK 50 +#define CUTOFF_WRITEBACK_SYNC 75 + +struct kmem_cache *bch_search_cache; + +static void check_should_skip(struct cached_dev *, struct search *); + +/* Cgroup interface */ + +#ifdef CONFIG_CGROUP_BCACHE +static struct bch_cgroup bcache_default_cgroup = { .cache_mode = -1 }; + +static struct bch_cgroup *cgroup_to_bcache(struct cgroup *cgroup) +{ + struct cgroup_subsys_state *css; + return cgroup && + (css = cgroup_subsys_state(cgroup, bcache_subsys_id)) + ? container_of(css, struct bch_cgroup, css) + : &bcache_default_cgroup; +} + +struct bch_cgroup *bch_bio_to_cgroup(struct bio *bio) +{ + struct cgroup_subsys_state *css = bio->bi_css + ? cgroup_subsys_state(bio->bi_css->cgroup, bcache_subsys_id) + : task_subsys_state(current, bcache_subsys_id); + + return css + ? container_of(css, struct bch_cgroup, css) + : &bcache_default_cgroup; +} + +static ssize_t cache_mode_read(struct cgroup *cgrp, struct cftype *cft, + struct file *file, + char __user *buf, size_t nbytes, loff_t *ppos) +{ + char tmp[1024]; + int len = bch_snprint_string_list(tmp, PAGE_SIZE, bch_cache_modes, + cgroup_to_bcache(cgrp)->cache_mode + 1); + + if (len < 0) + return len; + + return simple_read_from_buffer(buf, nbytes, ppos, tmp, len); +} + +static int cache_mode_write(struct cgroup *cgrp, struct cftype *cft, + const char *buf) +{ + int v = bch_read_string_list(buf, bch_cache_modes); + if (v < 0) + return v; + + cgroup_to_bcache(cgrp)->cache_mode = v - 1; + return 0; +} + +static u64 bch_verify_read(struct cgroup *cgrp, struct cftype *cft) +{ + return cgroup_to_bcache(cgrp)->verify; +} + +static int bch_verify_write(struct cgroup *cgrp, struct cftype *cft, u64 val) +{ + cgroup_to_bcache(cgrp)->verify = val; + return 0; +} + +static u64 bch_cache_hits_read(struct cgroup *cgrp, struct cftype *cft) +{ + struct bch_cgroup *bcachecg = cgroup_to_bcache(cgrp); + return atomic_read(&bcachecg->stats.cache_hits); +} + +static u64 bch_cache_misses_read(struct cgroup *cgrp, struct cftype *cft) +{ + struct bch_cgroup *bcachecg = cgroup_to_bcache(cgrp); + return atomic_read(&bcachecg->stats.cache_misses); +} + +static u64 bch_cache_bypass_hits_read(struct cgroup *cgrp, + struct cftype *cft) +{ + struct bch_cgroup *bcachecg = cgroup_to_bcache(cgrp); + return atomic_read(&bcachecg->stats.cache_bypass_hits); +} + +static u64 bch_cache_bypass_misses_read(struct cgroup *cgrp, + struct cftype *cft) +{ + struct bch_cgroup *bcachecg = cgroup_to_bcache(cgrp); + return atomic_read(&bcachecg->stats.cache_bypass_misses); +} + +static struct cftype bch_files[] = { + { + .name = "cache_mode", + .read = cache_mode_read, + .write_string = cache_mode_write, + }, + { + .name = "verify", + .read_u64 = bch_verify_read, + .write_u64 = bch_verify_write, + }, + { + .name = "cache_hits", + .read_u64 = bch_cache_hits_read, + }, + { + .name = "cache_misses", + .read_u64 = bch_cache_misses_read, + }, + { + .name = "cache_bypass_hits", + .read_u64 = bch_cache_bypass_hits_read, + }, + { + .name = "cache_bypass_misses", + .read_u64 = bch_cache_bypass_misses_read, + }, + { } /* terminate */ +}; + +static void init_bch_cgroup(struct bch_cgroup *cg) +{ + cg->cache_mode = -1; +} + +static struct cgroup_subsys_state *bcachecg_create(struct cgroup *cgroup) +{ + struct bch_cgroup *cg; + + cg = kzalloc(sizeof(*cg), GFP_KERNEL); + if (!cg) + return ERR_PTR(-ENOMEM); + init_bch_cgroup(cg); + return &cg->css; +} + +static void bcachecg_destroy(struct cgroup *cgroup) +{ + struct bch_cgroup *cg = cgroup_to_bcache(cgroup); + free_css_id(&bcache_subsys, &cg->css); + kfree(cg); +} + +struct cgroup_subsys bcache_subsys = { + .create = bcachecg_create, + .destroy = bcachecg_destroy, + .subsys_id = bcache_subsys_id, + .name = "bcache", + .module = THIS_MODULE, +}; +EXPORT_SYMBOL_GPL(bcache_subsys); +#endif + +static unsigned cache_mode(struct cached_dev *dc, struct bio *bio) +{ +#ifdef CONFIG_CGROUP_BCACHE + int r = bch_bio_to_cgroup(bio)->cache_mode; + if (r >= 0) + return r; +#endif + return BDEV_CACHE_MODE(&dc->sb); +} + +static bool verify(struct cached_dev *dc, struct bio *bio) +{ +#ifdef CONFIG_CGROUP_BCACHE + if (bch_bio_to_cgroup(bio)->verify) + return true; +#endif + return dc->verify; +} + +static void bio_csum(struct bio *bio, struct bkey *k) +{ + struct bio_vec *bv; + uint64_t csum = 0; + int i; + + bio_for_each_segment(bv, bio, i) { + void *d = kmap(bv->bv_page) + bv->bv_offset; + csum = bch_crc64_update(csum, d, bv->bv_len); + kunmap(bv->bv_page); + } + + k->ptr[KEY_PTRS(k)] = csum & (~0ULL >> 1); +} + +/* Insert data into cache */ + +static void bio_invalidate(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + struct bio *bio = op->cache_bio; + + pr_debug("invalidating %i sectors from %llu", + bio_sectors(bio), (uint64_t) bio->bi_sector); + + while (bio_sectors(bio)) { + unsigned len = min(bio_sectors(bio), 1U << 14); + + if (bch_keylist_realloc(&op->keys, 0, op->c)) + goto out; + + bio->bi_sector += len; + bio->bi_size -= len << 9; + + bch_keylist_add(&op->keys, + &KEY(op->inode, bio->bi_sector, len)); + } + + op->insert_data_done = true; + bio_put(bio); +out: + continue_at(cl, bch_journal, bcache_wq); +} + +struct open_bucket { + struct list_head list; + struct task_struct *last; + unsigned sectors_free; + BKEY_PADDED(key); +}; + +void bch_open_buckets_free(struct cache_set *c) +{ + struct open_bucket *b; + + while (!list_empty(&c->data_buckets)) { + b = list_first_entry(&c->data_buckets, + struct open_bucket, list); + list_del(&b->list); + kfree(b); + } +} + +int bch_open_buckets_alloc(struct cache_set *c) +{ + int i; + + spin_lock_init(&c->data_bucket_lock); + + for (i = 0; i < 6; i++) { + struct open_bucket *b = kzalloc(sizeof(*b), GFP_KERNEL); + if (!b) + return -ENOMEM; + + list_add(&b->list, &c->data_buckets); + } + + return 0; +} + +/* + * We keep multiple buckets open for writes, and try to segregate different + * write streams for better cache utilization: first we look for a bucket where + * the last write to it was sequential with the current write, and failing that + * we look for a bucket that was last used by the same task. + * + * The ideas is if you've got multiple tasks pulling data into the cache at the + * same time, you'll get better cache utilization if you try to segregate their + * data and preserve locality. + * + * For example, say you've starting Firefox at the same time you're copying a + * bunch of files. Firefox will likely end up being fairly hot and stay in the + * cache awhile, but the data you copied might not be; if you wrote all that + * data to the same buckets it'd get invalidated at the same time. + * + * Both of those tasks will be doing fairly random IO so we can't rely on + * detecting sequential IO to segregate their data, but going off of the task + * should be a sane heuristic. + */ +static struct open_bucket *pick_data_bucket(struct cache_set *c, + const struct bkey *search, + struct task_struct *task, + struct bkey *alloc) +{ + struct open_bucket *ret, *ret_task = NULL; + + list_for_each_entry_reverse(ret, &c->data_buckets, list) + if (!bkey_cmp(&ret->key, search)) + goto found; + else if (ret->last == task) + ret_task = ret; + + ret = ret_task ?: list_first_entry(&c->data_buckets, + struct open_bucket, list); +found: + if (!ret->sectors_free && KEY_PTRS(alloc)) { + ret->sectors_free = c->sb.bucket_size; + bkey_copy(&ret->key, alloc); + bkey_init(alloc); + } + + if (!ret->sectors_free) + ret = NULL; + + return ret; +} + +/* + * Allocates some space in the cache to write to, and k to point to the newly + * allocated space, and updates KEY_SIZE(k) and KEY_OFFSET(k) (to point to the + * end of the newly allocated space). + * + * May allocate fewer sectors than @sectors, KEY_SIZE(k) indicates how many + * sectors were actually allocated. + * + * If s->writeback is true, will not fail. + */ +static bool bch_alloc_sectors(struct bkey *k, unsigned sectors, + struct search *s) +{ + struct cache_set *c = s->op.c; + struct open_bucket *b; + BKEY_PADDED(key) alloc; + struct closure cl, *w = NULL; + unsigned i; + + if (s->writeback) { + closure_init_stack(&cl); + w = &cl; + } + + /* + * We might have to allocate a new bucket, which we can't do with a + * spinlock held. So if we have to allocate, we drop the lock, allocate + * and then retry. KEY_PTRS() indicates whether alloc points to + * allocated bucket(s). + */ + + bkey_init(&alloc.key); + spin_lock(&c->data_bucket_lock); + + while (!(b = pick_data_bucket(c, k, s->task, &alloc.key))) { + unsigned watermark = s->op.write_prio + ? WATERMARK_MOVINGGC + : WATERMARK_NONE; + + spin_unlock(&c->data_bucket_lock); + + if (bch_bucket_alloc_set(c, watermark, &alloc.key, 1, w)) + return false; + + spin_lock(&c->data_bucket_lock); + } + + /* + * If we had to allocate, we might race and not need to allocate the + * second time we call find_data_bucket(). If we allocated a bucket but + * didn't use it, drop the refcount bch_bucket_alloc_set() took: + */ + if (KEY_PTRS(&alloc.key)) + __bkey_put(c, &alloc.key); + + for (i = 0; i < KEY_PTRS(&b->key); i++) + EBUG_ON(ptr_stale(c, &b->key, i)); + + /* Set up the pointer to the space we're allocating: */ + + for (i = 0; i < KEY_PTRS(&b->key); i++) + k->ptr[i] = b->key.ptr[i]; + + sectors = min(sectors, b->sectors_free); + + SET_KEY_OFFSET(k, KEY_OFFSET(k) + sectors); + SET_KEY_SIZE(k, sectors); + SET_KEY_PTRS(k, KEY_PTRS(&b->key)); + + /* + * Move b to the end of the lru, and keep track of what this bucket was + * last used for: + */ + list_move_tail(&b->list, &c->data_buckets); + bkey_copy_key(&b->key, k); + b->last = s->task; + + b->sectors_free -= sectors; + + for (i = 0; i < KEY_PTRS(&b->key); i++) { + SET_PTR_OFFSET(&b->key, i, PTR_OFFSET(&b->key, i) + sectors); + + atomic_long_add(sectors, + &PTR_CACHE(c, &b->key, i)->sectors_written); + } + + if (b->sectors_free < c->sb.block_size) + b->sectors_free = 0; + + /* + * k takes refcounts on the buckets it points to until it's inserted + * into the btree, but if we're done with this bucket we just transfer + * get_data_bucket()'s refcount. + */ + if (b->sectors_free) + for (i = 0; i < KEY_PTRS(&b->key); i++) + atomic_inc(&PTR_BUCKET(c, &b->key, i)->pin); + + spin_unlock(&c->data_bucket_lock); + return true; +} + +static void bch_insert_data_error(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + + /* + * Our data write just errored, which means we've got a bunch of keys to + * insert that point to data that wasn't succesfully written. + * + * We don't have to insert those keys but we still have to invalidate + * that region of the cache - so, if we just strip off all the pointers + * from the keys we'll accomplish just that. + */ + + struct bkey *src = op->keys.bottom, *dst = op->keys.bottom; + + while (src != op->keys.top) { + struct bkey *n = bkey_next(src); + + SET_KEY_PTRS(src, 0); + bkey_copy(dst, src); + + dst = bkey_next(dst); + src = n; + } + + op->keys.top = dst; + + bch_journal(cl); +} + +static void bch_insert_data_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + struct btree_op *op = container_of(cl, struct btree_op, cl); + struct search *s = container_of(op, struct search, op); + + if (error) { + /* TODO: We could try to recover from this. */ + if (s->writeback) + s->error = error; + else if (s->write) + set_closure_fn(cl, bch_insert_data_error, bcache_wq); + else + set_closure_fn(cl, NULL, NULL); + } + + bch_bbio_endio(op->c, bio, error, "writing data to cache"); +} + +static void bch_insert_data_loop(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + struct search *s = container_of(op, struct search, op); + struct bio *bio = op->cache_bio, *n; + + if (op->skip) + return bio_invalidate(cl); + + if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0) { + set_gc_sectors(op->c); + bch_queue_gc(op->c); + } + + do { + unsigned i; + struct bkey *k; + struct bio_set *split = s->d + ? s->d->bio_split : op->c->bio_split; + + /* 1 for the device pointer and 1 for the chksum */ + if (bch_keylist_realloc(&op->keys, + 1 + (op->csum ? 1 : 0), + op->c)) + continue_at(cl, bch_journal, bcache_wq); + + k = op->keys.top; + bkey_init(k); + SET_KEY_INODE(k, op->inode); + SET_KEY_OFFSET(k, bio->bi_sector); + + if (!bch_alloc_sectors(k, bio_sectors(bio), s)) + goto err; + + n = bch_bio_split(bio, KEY_SIZE(k), GFP_NOIO, split); + if (!n) { + __bkey_put(op->c, k); + continue_at(cl, bch_insert_data_loop, bcache_wq); + } + + n->bi_end_io = bch_insert_data_endio; + n->bi_private = cl; + + if (s->writeback) { + SET_KEY_DIRTY(k, true); + + for (i = 0; i < KEY_PTRS(k); i++) + SET_GC_MARK(PTR_BUCKET(op->c, k, i), + GC_MARK_DIRTY); + } + + SET_KEY_CSUM(k, op->csum); + if (KEY_CSUM(k)) + bio_csum(n, k); + + pr_debug("%s", pkey(k)); + bch_keylist_push(&op->keys); + + trace_bcache_cache_insert(n, n->bi_sector, n->bi_bdev); + n->bi_rw |= REQ_WRITE; + bch_submit_bbio(n, op->c, k, 0); + } while (n != bio); + + op->insert_data_done = true; + continue_at(cl, bch_journal, bcache_wq); +err: + /* bch_alloc_sectors() blocks if s->writeback = true */ + BUG_ON(s->writeback); + + /* + * But if it's not a writeback write we'd rather just bail out if + * there aren't any buckets ready to write to - it might take awhile and + * we might be starving btree writes for gc or something. + */ + + if (s->write) { + /* + * Writethrough write: We can't complete the write until we've + * updated the index. But we don't want to delay the write while + * we wait for buckets to be freed up, so just invalidate the + * rest of the write. + */ + op->skip = true; + return bio_invalidate(cl); + } else { + /* + * From a cache miss, we can just insert the keys for the data + * we have written or bail out if we didn't do anything. + */ + op->insert_data_done = true; + bio_put(bio); + + if (!bch_keylist_empty(&op->keys)) + continue_at(cl, bch_journal, bcache_wq); + else + closure_return(cl); + } +} + +/** + * bch_insert_data - stick some data in the cache + * + * This is the starting point for any data to end up in a cache device; it could + * be from a normal write, or a writeback write, or a write to a flash only + * volume - it's also used by the moving garbage collector to compact data in + * mostly empty buckets. + * + * It first writes the data to the cache, creating a list of keys to be inserted + * (if the data had to be fragmented there will be multiple keys); after the + * data is written it calls bch_journal, and after the keys have been added to + * the next journal write they're inserted into the btree. + * + * It inserts the data in op->cache_bio; bi_sector is used for the key offset, + * and op->inode is used for the key inode. + * + * If op->skip is true, instead of inserting the data it invalidates the region + * of the cache represented by op->cache_bio and op->inode. + */ +void bch_insert_data(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + + bch_keylist_init(&op->keys); + bio_get(op->cache_bio); + bch_insert_data_loop(cl); +} + +void bch_btree_insert_async(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + struct search *s = container_of(op, struct search, op); + + if (bch_btree_insert(op, op->c)) { + s->error = -ENOMEM; + op->insert_data_done = true; + } + + if (op->insert_data_done) { + bch_keylist_free(&op->keys); + closure_return(cl); + } else + continue_at(cl, bch_insert_data_loop, bcache_wq); +} + +/* Common code for the make_request functions */ + +static void request_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + + if (error) { + struct search *s = container_of(cl, struct search, cl); + s->error = error; + /* Only cache read errors are recoverable */ + s->recoverable = false; + } + + bio_put(bio); + closure_put(cl); +} + +void bch_cache_read_endio(struct bio *bio, int error) +{ + struct bbio *b = container_of(bio, struct bbio, bio); + struct closure *cl = bio->bi_private; + struct search *s = container_of(cl, struct search, cl); + + /* + * If the bucket was reused while our bio was in flight, we might have + * read the wrong data. Set s->error but not error so it doesn't get + * counted against the cache device, but we'll still reread the data + * from the backing device. + */ + + if (error) + s->error = error; + else if (ptr_stale(s->op.c, &b->key, 0)) { + atomic_long_inc(&s->op.c->cache_read_races); + s->error = -EINTR; + } + + bch_bbio_endio(s->op.c, bio, error, "reading from cache"); +} + +static void bio_complete(struct search *s) +{ + if (s->orig_bio) { + int cpu, rw = bio_data_dir(s->orig_bio); + unsigned long duration = jiffies - s->start_time; + + cpu = part_stat_lock(); + part_round_stats(cpu, &s->d->disk->part0); + part_stat_add(cpu, &s->d->disk->part0, ticks[rw], duration); + part_stat_unlock(); + + trace_bcache_request_end(s, s->orig_bio); + bio_endio(s->orig_bio, s->error); + s->orig_bio = NULL; + } +} + +static void do_bio_hook(struct search *s) +{ + struct bio *bio = &s->bio.bio; + memcpy(bio, s->orig_bio, sizeof(struct bio)); + + bio->bi_end_io = request_endio; + bio->bi_private = &s->cl; + atomic_set(&bio->bi_cnt, 3); +} + +static void search_free(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + bio_complete(s); + + if (s->op.cache_bio) + bio_put(s->op.cache_bio); + + if (s->unaligned_bvec) + mempool_free(s->bio.bio.bi_io_vec, s->d->unaligned_bvec); + + closure_debug_destroy(cl); + mempool_free(s, s->d->c->search); +} + +static struct search *search_alloc(struct bio *bio, struct bcache_device *d) +{ + struct bio_vec *bv; + struct search *s = mempool_alloc(d->c->search, GFP_NOIO); + memset(s, 0, offsetof(struct search, op.keys)); + + __closure_init(&s->cl, NULL); + + s->op.inode = d->id; + s->op.c = d->c; + s->d = d; + s->op.lock = -1; + s->task = current; + s->orig_bio = bio; + s->write = (bio->bi_rw & REQ_WRITE) != 0; + s->op.flush_journal = (bio->bi_rw & REQ_FLUSH) != 0; + s->op.skip = (bio->bi_rw & REQ_DISCARD) != 0; + s->recoverable = 1; + s->start_time = jiffies; + do_bio_hook(s); + + if (bio->bi_size != bio_segments(bio) * PAGE_SIZE) { + bv = mempool_alloc(d->unaligned_bvec, GFP_NOIO); + memcpy(bv, bio_iovec(bio), + sizeof(struct bio_vec) * bio_segments(bio)); + + s->bio.bio.bi_io_vec = bv; + s->unaligned_bvec = 1; + } + + return s; +} + +static void btree_read_async(struct closure *cl) +{ + struct btree_op *op = container_of(cl, struct btree_op, cl); + + int ret = btree_root(search_recurse, op->c, op); + + if (ret == -EAGAIN) + continue_at(cl, btree_read_async, bcache_wq); + + closure_return(cl); +} + +/* Cached devices */ + +static void cached_dev_bio_complete(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + + search_free(cl); + cached_dev_put(dc); +} + +/* Process reads */ + +static void cached_dev_read_complete(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + + if (s->op.insert_collision) + bch_mark_cache_miss_collision(s); + + if (s->op.cache_bio) { + int i; + struct bio_vec *bv; + + __bio_for_each_segment(bv, s->op.cache_bio, i, 0) + __free_page(bv->bv_page); + } + + cached_dev_bio_complete(cl); +} + +static void request_read_error(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct bio_vec *bv; + int i; + + if (s->recoverable) { + /* The cache read failed, but we can retry from the backing + * device. + */ + pr_debug("recovering at sector %llu", + (uint64_t) s->orig_bio->bi_sector); + + s->error = 0; + bv = s->bio.bio.bi_io_vec; + do_bio_hook(s); + s->bio.bio.bi_io_vec = bv; + + if (!s->unaligned_bvec) + bio_for_each_segment(bv, s->orig_bio, i) + bv->bv_offset = 0, bv->bv_len = PAGE_SIZE; + else + memcpy(s->bio.bio.bi_io_vec, + bio_iovec(s->orig_bio), + sizeof(struct bio_vec) * + bio_segments(s->orig_bio)); + + /* XXX: invalidate cache */ + + trace_bcache_read_retry(&s->bio.bio); + closure_bio_submit(&s->bio.bio, &s->cl, s->d); + } + + continue_at(cl, cached_dev_read_complete, NULL); +} + +static void request_read_done(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + + /* + * s->cache_bio != NULL implies that we had a cache miss; cache_bio now + * contains data ready to be inserted into the cache. + * + * First, we copy the data we just read from cache_bio's bounce buffers + * to the buffers the original bio pointed to: + */ + + if (s->op.cache_bio) { + struct bio_vec *src, *dst; + unsigned src_offset, dst_offset, bytes; + void *dst_ptr; + + bio_reset(s->op.cache_bio); + s->op.cache_bio->bi_sector = s->cache_miss->bi_sector; + s->op.cache_bio->bi_bdev = s->cache_miss->bi_bdev; + s->op.cache_bio->bi_size = s->cache_bio_sectors << 9; + bch_bio_map(s->op.cache_bio, NULL); + + src = bio_iovec(s->op.cache_bio); + dst = bio_iovec(s->cache_miss); + src_offset = src->bv_offset; + dst_offset = dst->bv_offset; + dst_ptr = kmap(dst->bv_page); + + while (1) { + if (dst_offset == dst->bv_offset + dst->bv_len) { + kunmap(dst->bv_page); + dst++; + if (dst == bio_iovec_idx(s->cache_miss, + s->cache_miss->bi_vcnt)) + break; + + dst_offset = dst->bv_offset; + dst_ptr = kmap(dst->bv_page); + } + + if (src_offset == src->bv_offset + src->bv_len) { + src++; + if (src == bio_iovec_idx(s->op.cache_bio, + s->op.cache_bio->bi_vcnt)) + BUG(); + + src_offset = src->bv_offset; + } + + bytes = min(dst->bv_offset + dst->bv_len - dst_offset, + src->bv_offset + src->bv_len - src_offset); + + memcpy(dst_ptr + dst_offset, + page_address(src->bv_page) + src_offset, + bytes); + + src_offset += bytes; + dst_offset += bytes; + } + + bio_put(s->cache_miss); + s->cache_miss = NULL; + } + + if (verify(dc, &s->bio.bio) && s->recoverable) + bch_data_verify(s); + + bio_complete(s); + + if (s->op.cache_bio && + !test_bit(CACHE_SET_STOPPING, &s->op.c->flags)) { + s->op.type = BTREE_REPLACE; + closure_call(&s->op.cl, bch_insert_data, NULL, cl); + } + + continue_at(cl, cached_dev_read_complete, NULL); +} + +static void request_read_done_bh(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + + bch_mark_cache_accounting(s, !s->cache_miss, s->op.skip); + + if (s->error) + continue_at_nobarrier(cl, request_read_error, bcache_wq); + else if (s->op.cache_bio || verify(dc, &s->bio.bio)) + continue_at_nobarrier(cl, request_read_done, bcache_wq); + else + continue_at_nobarrier(cl, cached_dev_read_complete, NULL); +} + +static int cached_dev_cache_miss(struct btree *b, struct search *s, + struct bio *bio, unsigned sectors) +{ + int ret = 0; + unsigned reada; + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + struct bio *miss; + + miss = bch_bio_split(bio, sectors, GFP_NOIO, s->d->bio_split); + if (!miss) + return -EAGAIN; + + if (miss == bio) + s->op.lookup_done = true; + + miss->bi_end_io = request_endio; + miss->bi_private = &s->cl; + + if (s->cache_miss || s->op.skip) + goto out_submit; + + if (miss != bio || + (bio->bi_rw & REQ_RAHEAD) || + (bio->bi_rw & REQ_META) || + s->op.c->gc_stats.in_use >= CUTOFF_CACHE_READA) + reada = 0; + else { + reada = min(dc->readahead >> 9, + sectors - bio_sectors(miss)); + + if (bio_end(miss) + reada > bdev_sectors(miss->bi_bdev)) + reada = bdev_sectors(miss->bi_bdev) - bio_end(miss); + } + + s->cache_bio_sectors = bio_sectors(miss) + reada; + s->op.cache_bio = bio_alloc_bioset(GFP_NOWAIT, + DIV_ROUND_UP(s->cache_bio_sectors, PAGE_SECTORS), + dc->disk.bio_split); + + if (!s->op.cache_bio) + goto out_submit; + + s->op.cache_bio->bi_sector = miss->bi_sector; + s->op.cache_bio->bi_bdev = miss->bi_bdev; + s->op.cache_bio->bi_size = s->cache_bio_sectors << 9; + + s->op.cache_bio->bi_end_io = request_endio; + s->op.cache_bio->bi_private = &s->cl; + + /* btree_search_recurse()'s btree iterator is no good anymore */ + ret = -EINTR; + if (!bch_btree_insert_check_key(b, &s->op, s->op.cache_bio)) + goto out_put; + + bch_bio_map(s->op.cache_bio, NULL); + if (bch_bio_alloc_pages(s->op.cache_bio, __GFP_NOWARN|GFP_NOIO)) + goto out_put; + + s->cache_miss = miss; + bio_get(s->op.cache_bio); + + trace_bcache_cache_miss(s->orig_bio); + closure_bio_submit(s->op.cache_bio, &s->cl, s->d); + + return ret; +out_put: + bio_put(s->op.cache_bio); + s->op.cache_bio = NULL; +out_submit: + closure_bio_submit(miss, &s->cl, s->d); + return ret; +} + +static void request_read(struct cached_dev *dc, struct search *s) +{ + struct closure *cl = &s->cl; + + check_should_skip(dc, s); + closure_call(&s->op.cl, btree_read_async, NULL, cl); + + continue_at(cl, request_read_done_bh, NULL); +} + +/* Process writes */ + +static void cached_dev_write_complete(struct closure *cl) +{ + struct search *s = container_of(cl, struct search, cl); + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + + up_read_non_owner(&dc->writeback_lock); + cached_dev_bio_complete(cl); +} + +static bool should_writeback(struct cached_dev *dc, struct bio *bio) +{ + unsigned threshold = (bio->bi_rw & REQ_SYNC) + ? CUTOFF_WRITEBACK_SYNC + : CUTOFF_WRITEBACK; + + return !atomic_read(&dc->disk.detaching) && + cache_mode(dc, bio) == CACHE_MODE_WRITEBACK && + dc->disk.c->gc_stats.in_use < threshold; +} + +static void request_write(struct cached_dev *dc, struct search *s) +{ + struct closure *cl = &s->cl; + struct bio *bio = &s->bio.bio; + struct bkey start, end; + start = KEY(dc->disk.id, bio->bi_sector, 0); + end = KEY(dc->disk.id, bio_end(bio), 0); + + bch_keybuf_check_overlapping(&s->op.c->moving_gc_keys, &start, &end); + + check_should_skip(dc, s); + down_read_non_owner(&dc->writeback_lock); + + if (bch_keybuf_check_overlapping(&dc->writeback_keys, &start, &end)) { + s->op.skip = false; + s->writeback = true; + } + + if (bio->bi_rw & REQ_DISCARD) + goto skip; + + if (s->op.skip) + goto skip; + + if (should_writeback(dc, s->orig_bio)) + s->writeback = true; + + if (!s->writeback) { + s->op.cache_bio = bio_clone_bioset(bio, GFP_NOIO, + dc->disk.bio_split); + + trace_bcache_writethrough(s->orig_bio); + closure_bio_submit(bio, cl, s->d); + } else { + s->op.cache_bio = bio; + trace_bcache_writeback(s->orig_bio); + bch_writeback_add(dc, bio_sectors(bio)); + } +out: + closure_call(&s->op.cl, bch_insert_data, NULL, cl); + continue_at(cl, cached_dev_write_complete, NULL); +skip: + s->op.skip = true; + s->op.cache_bio = s->orig_bio; + bio_get(s->op.cache_bio); + trace_bcache_write_skip(s->orig_bio); + + if ((bio->bi_rw & REQ_DISCARD) && + !blk_queue_discard(bdev_get_queue(dc->bdev))) + goto out; + + closure_bio_submit(bio, cl, s->d); + goto out; +} + +static void request_nodata(struct cached_dev *dc, struct search *s) +{ + struct closure *cl = &s->cl; + struct bio *bio = &s->bio.bio; + + if (bio->bi_rw & REQ_DISCARD) { + request_write(dc, s); + return; + } + + if (s->op.flush_journal) + bch_journal_meta(s->op.c, cl); + + closure_bio_submit(bio, cl, s->d); + + continue_at(cl, cached_dev_bio_complete, NULL); +} + +/* Cached devices - read & write stuff */ + +int bch_get_congested(struct cache_set *c) +{ + int i; + + if (!c->congested_read_threshold_us && + !c->congested_write_threshold_us) + return 0; + + i = (local_clock_us() - c->congested_last_us) / 1024; + if (i < 0) + return 0; + + i += atomic_read(&c->congested); + if (i >= 0) + return 0; + + i += CONGESTED_MAX; + + return i <= 0 ? 1 : fract_exp_two(i, 6); +} + +static void add_sequential(struct task_struct *t) +{ + ewma_add(t->sequential_io_avg, + t->sequential_io, 8, 0); + + t->sequential_io = 0; +} + +static struct hlist_head *iohash(struct cached_dev *dc, uint64_t k) +{ + return &dc->io_hash[hash_64(k, RECENT_IO_BITS)]; +} + +static void check_should_skip(struct cached_dev *dc, struct search *s) +{ + struct cache_set *c = s->op.c; + struct bio *bio = &s->bio.bio; + + long rand; + int cutoff = bch_get_congested(c); + unsigned mode = cache_mode(dc, bio); + + if (atomic_read(&dc->disk.detaching) || + c->gc_stats.in_use > CUTOFF_CACHE_ADD || + (bio->bi_rw & REQ_DISCARD)) + goto skip; + + if (mode == CACHE_MODE_NONE || + (mode == CACHE_MODE_WRITEAROUND && + (bio->bi_rw & REQ_WRITE))) + goto skip; + + if (bio->bi_sector & (c->sb.block_size - 1) || + bio_sectors(bio) & (c->sb.block_size - 1)) { + pr_debug("skipping unaligned io"); + goto skip; + } + + if (!cutoff) { + cutoff = dc->sequential_cutoff >> 9; + + if (!cutoff) + goto rescale; + + if (mode == CACHE_MODE_WRITEBACK && + (bio->bi_rw & REQ_WRITE) && + (bio->bi_rw & REQ_SYNC)) + goto rescale; + } + + if (dc->sequential_merge) { + struct io *i; + + spin_lock(&dc->io_lock); + + hlist_for_each_entry(i, iohash(dc, bio->bi_sector), hash) + if (i->last == bio->bi_sector && + time_before(jiffies, i->jiffies)) + goto found; + + i = list_first_entry(&dc->io_lru, struct io, lru); + + add_sequential(s->task); + i->sequential = 0; +found: + if (i->sequential + bio->bi_size > i->sequential) + i->sequential += bio->bi_size; + + i->last = bio_end(bio); + i->jiffies = jiffies + msecs_to_jiffies(5000); + s->task->sequential_io = i->sequential; + + hlist_del(&i->hash); + hlist_add_head(&i->hash, iohash(dc, i->last)); + list_move_tail(&i->lru, &dc->io_lru); + + spin_unlock(&dc->io_lock); + } else { + s->task->sequential_io = bio->bi_size; + + add_sequential(s->task); + } + + rand = get_random_int(); + cutoff -= bitmap_weight(&rand, BITS_PER_LONG); + + if (cutoff <= (int) (max(s->task->sequential_io, + s->task->sequential_io_avg) >> 9)) + goto skip; + +rescale: + bch_rescale_priorities(c, bio_sectors(bio)); + return; +skip: + bch_mark_sectors_bypassed(s, bio_sectors(bio)); + s->op.skip = true; +} + +static void cached_dev_make_request(struct request_queue *q, struct bio *bio) +{ + struct search *s; + struct bcache_device *d = bio->bi_bdev->bd_disk->private_data; + struct cached_dev *dc = container_of(d, struct cached_dev, disk); + int cpu, rw = bio_data_dir(bio); + + cpu = part_stat_lock(); + part_stat_inc(cpu, &d->disk->part0, ios[rw]); + part_stat_add(cpu, &d->disk->part0, sectors[rw], bio_sectors(bio)); + part_stat_unlock(); + + bio->bi_bdev = dc->bdev; + bio->bi_sector += dc->sb.data_offset; + + if (cached_dev_get(dc)) { + s = search_alloc(bio, d); + trace_bcache_request_start(s, bio); + + if (!bio_has_data(bio)) + request_nodata(dc, s); + else if (rw) + request_write(dc, s); + else + request_read(dc, s); + } else { + if ((bio->bi_rw & REQ_DISCARD) && + !blk_queue_discard(bdev_get_queue(dc->bdev))) + bio_endio(bio, 0); + else + bch_generic_make_request(bio, &d->bio_split_hook); + } +} + +static int cached_dev_ioctl(struct bcache_device *d, fmode_t mode, + unsigned int cmd, unsigned long arg) +{ + struct cached_dev *dc = container_of(d, struct cached_dev, disk); + return __blkdev_driver_ioctl(dc->bdev, mode, cmd, arg); +} + +static int cached_dev_congested(void *data, int bits) +{ + struct bcache_device *d = data; + struct cached_dev *dc = container_of(d, struct cached_dev, disk); + struct request_queue *q = bdev_get_queue(dc->bdev); + int ret = 0; + + if (bdi_congested(&q->backing_dev_info, bits)) + return 1; + + if (cached_dev_get(dc)) { + unsigned i; + struct cache *ca; + + for_each_cache(ca, d->c, i) { + q = bdev_get_queue(ca->bdev); + ret |= bdi_congested(&q->backing_dev_info, bits); + } + + cached_dev_put(dc); + } + + return ret; +} + +void bch_cached_dev_request_init(struct cached_dev *dc) +{ + struct gendisk *g = dc->disk.disk; + + g->queue->make_request_fn = cached_dev_make_request; + g->queue->backing_dev_info.congested_fn = cached_dev_congested; + dc->disk.cache_miss = cached_dev_cache_miss; + dc->disk.ioctl = cached_dev_ioctl; +} + +/* Flash backed devices */ + +static int flash_dev_cache_miss(struct btree *b, struct search *s, + struct bio *bio, unsigned sectors) +{ + /* Zero fill bio */ + + while (bio->bi_idx != bio->bi_vcnt) { + struct bio_vec *bv = bio_iovec(bio); + unsigned j = min(bv->bv_len >> 9, sectors); + + void *p = kmap(bv->bv_page); + memset(p + bv->bv_offset, 0, j << 9); + kunmap(bv->bv_page); + + bv->bv_len -= j << 9; + bv->bv_offset += j << 9; + + if (bv->bv_len) + return 0; + + bio->bi_sector += j; + bio->bi_size -= j << 9; + + bio->bi_idx++; + sectors -= j; + } + + s->op.lookup_done = true; + + return 0; +} + +static void flash_dev_make_request(struct request_queue *q, struct bio *bio) +{ + struct search *s; + struct closure *cl; + struct bcache_device *d = bio->bi_bdev->bd_disk->private_data; + int cpu, rw = bio_data_dir(bio); + + cpu = part_stat_lock(); + part_stat_inc(cpu, &d->disk->part0, ios[rw]); + part_stat_add(cpu, &d->disk->part0, sectors[rw], bio_sectors(bio)); + part_stat_unlock(); + + s = search_alloc(bio, d); + cl = &s->cl; + bio = &s->bio.bio; + + trace_bcache_request_start(s, bio); + + if (bio_has_data(bio) && !rw) { + closure_call(&s->op.cl, btree_read_async, NULL, cl); + } else if (bio_has_data(bio) || s->op.skip) { + bch_keybuf_check_overlapping(&s->op.c->moving_gc_keys, + &KEY(d->id, bio->bi_sector, 0), + &KEY(d->id, bio_end(bio), 0)); + + s->writeback = true; + s->op.cache_bio = bio; + + closure_call(&s->op.cl, bch_insert_data, NULL, cl); + } else { + /* No data - probably a cache flush */ + if (s->op.flush_journal) + bch_journal_meta(s->op.c, cl); + } + + continue_at(cl, search_free, NULL); +} + +static int flash_dev_ioctl(struct bcache_device *d, fmode_t mode, + unsigned int cmd, unsigned long arg) +{ + return -ENOTTY; +} + +static int flash_dev_congested(void *data, int bits) +{ + struct bcache_device *d = data; + struct request_queue *q; + struct cache *ca; + unsigned i; + int ret = 0; + + for_each_cache(ca, d->c, i) { + q = bdev_get_queue(ca->bdev); + ret |= bdi_congested(&q->backing_dev_info, bits); + } + + return ret; +} + +void bch_flash_dev_request_init(struct bcache_device *d) +{ + struct gendisk *g = d->disk; + + g->queue->make_request_fn = flash_dev_make_request; + g->queue->backing_dev_info.congested_fn = flash_dev_congested; + d->cache_miss = flash_dev_cache_miss; + d->ioctl = flash_dev_ioctl; +} + +void bch_request_exit(void) +{ +#ifdef CONFIG_CGROUP_BCACHE + cgroup_unload_subsys(&bcache_subsys); +#endif + if (bch_search_cache) + kmem_cache_destroy(bch_search_cache); +} + +int __init bch_request_init(void) +{ + bch_search_cache = KMEM_CACHE(search, 0); + if (!bch_search_cache) + return -ENOMEM; + +#ifdef CONFIG_CGROUP_BCACHE + cgroup_load_subsys(&bcache_subsys); + init_bch_cgroup(&bcache_default_cgroup); + + cgroup_add_cftypes(&bcache_subsys, bch_files); +#endif + return 0; +} diff --git a/drivers/md/bcache/request.h b/drivers/md/bcache/request.h new file mode 100644 index 000000000000..254d9ab5707c --- /dev/null +++ b/drivers/md/bcache/request.h @@ -0,0 +1,62 @@ +#ifndef _BCACHE_REQUEST_H_ +#define _BCACHE_REQUEST_H_ + +#include + +struct search { + /* Stack frame for bio_complete */ + struct closure cl; + + struct bcache_device *d; + struct task_struct *task; + + struct bbio bio; + struct bio *orig_bio; + struct bio *cache_miss; + unsigned cache_bio_sectors; + + unsigned recoverable:1; + unsigned unaligned_bvec:1; + + unsigned write:1; + unsigned writeback:1; + + /* IO error returned to s->bio */ + short error; + unsigned long start_time; + + /* Anything past op->keys won't get zeroed in do_bio_hook */ + struct btree_op op; +}; + +void bch_cache_read_endio(struct bio *, int); +int bch_get_congested(struct cache_set *); +void bch_insert_data(struct closure *cl); +void bch_btree_insert_async(struct closure *); +void bch_cache_read_endio(struct bio *, int); + +void bch_open_buckets_free(struct cache_set *); +int bch_open_buckets_alloc(struct cache_set *); + +void bch_cached_dev_request_init(struct cached_dev *dc); +void bch_flash_dev_request_init(struct bcache_device *d); + +extern struct kmem_cache *bch_search_cache, *bch_passthrough_cache; + +struct bch_cgroup { +#ifdef CONFIG_CGROUP_BCACHE + struct cgroup_subsys_state css; +#endif + /* + * We subtract one from the index into bch_cache_modes[], so that + * default == -1; this makes it so the rest match up with d->cache_mode, + * and we use d->cache_mode if cgrp->cache_mode < 0 + */ + short cache_mode; + bool verify; + struct cache_stat_collector stats; +}; + +struct bch_cgroup *bch_bio_to_cgroup(struct bio *bio); + +#endif /* _BCACHE_REQUEST_H_ */ diff --git a/drivers/md/bcache/stats.c b/drivers/md/bcache/stats.c new file mode 100644 index 000000000000..64e679449c2a --- /dev/null +++ b/drivers/md/bcache/stats.c @@ -0,0 +1,246 @@ +/* + * bcache stats code + * + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "stats.h" +#include "btree.h" +#include "request.h" +#include "sysfs.h" + +/* + * We keep absolute totals of various statistics, and addionally a set of three + * rolling averages. + * + * Every so often, a timer goes off and rescales the rolling averages. + * accounting_rescale[] is how many times the timer has to go off before we + * rescale each set of numbers; that gets us half lives of 5 minutes, one hour, + * and one day. + * + * accounting_delay is how often the timer goes off - 22 times in 5 minutes, + * and accounting_weight is what we use to rescale: + * + * pow(31 / 32, 22) ~= 1/2 + * + * So that we don't have to increment each set of numbers every time we (say) + * get a cache hit, we increment a single atomic_t in acc->collector, and when + * the rescale function runs it resets the atomic counter to 0 and adds its + * old value to each of the exported numbers. + * + * To reduce rounding error, the numbers in struct cache_stats are all + * stored left shifted by 16, and scaled back in the sysfs show() function. + */ + +static const unsigned DAY_RESCALE = 288; +static const unsigned HOUR_RESCALE = 12; +static const unsigned FIVE_MINUTE_RESCALE = 1; +static const unsigned accounting_delay = (HZ * 300) / 22; +static const unsigned accounting_weight = 32; + +/* sysfs reading/writing */ + +read_attribute(cache_hits); +read_attribute(cache_misses); +read_attribute(cache_bypass_hits); +read_attribute(cache_bypass_misses); +read_attribute(cache_hit_ratio); +read_attribute(cache_readaheads); +read_attribute(cache_miss_collisions); +read_attribute(bypassed); + +SHOW(bch_stats) +{ + struct cache_stats *s = + container_of(kobj, struct cache_stats, kobj); +#define var(stat) (s->stat >> 16) + var_print(cache_hits); + var_print(cache_misses); + var_print(cache_bypass_hits); + var_print(cache_bypass_misses); + + sysfs_print(cache_hit_ratio, + DIV_SAFE(var(cache_hits) * 100, + var(cache_hits) + var(cache_misses))); + + var_print(cache_readaheads); + var_print(cache_miss_collisions); + sysfs_hprint(bypassed, var(sectors_bypassed) << 9); +#undef var + return 0; +} + +STORE(bch_stats) +{ + return size; +} + +static void bch_stats_release(struct kobject *k) +{ +} + +static struct attribute *bch_stats_files[] = { + &sysfs_cache_hits, + &sysfs_cache_misses, + &sysfs_cache_bypass_hits, + &sysfs_cache_bypass_misses, + &sysfs_cache_hit_ratio, + &sysfs_cache_readaheads, + &sysfs_cache_miss_collisions, + &sysfs_bypassed, + NULL +}; +static KTYPE(bch_stats); + +static void scale_accounting(unsigned long data); + +void bch_cache_accounting_init(struct cache_accounting *acc, + struct closure *parent) +{ + kobject_init(&acc->total.kobj, &bch_stats_ktype); + kobject_init(&acc->five_minute.kobj, &bch_stats_ktype); + kobject_init(&acc->hour.kobj, &bch_stats_ktype); + kobject_init(&acc->day.kobj, &bch_stats_ktype); + + closure_init(&acc->cl, parent); + init_timer(&acc->timer); + acc->timer.expires = jiffies + accounting_delay; + acc->timer.data = (unsigned long) acc; + acc->timer.function = scale_accounting; + add_timer(&acc->timer); +} + +int bch_cache_accounting_add_kobjs(struct cache_accounting *acc, + struct kobject *parent) +{ + int ret = kobject_add(&acc->total.kobj, parent, + "stats_total"); + ret = ret ?: kobject_add(&acc->five_minute.kobj, parent, + "stats_five_minute"); + ret = ret ?: kobject_add(&acc->hour.kobj, parent, + "stats_hour"); + ret = ret ?: kobject_add(&acc->day.kobj, parent, + "stats_day"); + return ret; +} + +void bch_cache_accounting_clear(struct cache_accounting *acc) +{ + memset(&acc->total.cache_hits, + 0, + sizeof(unsigned long) * 7); +} + +void bch_cache_accounting_destroy(struct cache_accounting *acc) +{ + kobject_put(&acc->total.kobj); + kobject_put(&acc->five_minute.kobj); + kobject_put(&acc->hour.kobj); + kobject_put(&acc->day.kobj); + + atomic_set(&acc->closing, 1); + if (del_timer_sync(&acc->timer)) + closure_return(&acc->cl); +} + +/* EWMA scaling */ + +static void scale_stat(unsigned long *stat) +{ + *stat = ewma_add(*stat, 0, accounting_weight, 0); +} + +static void scale_stats(struct cache_stats *stats, unsigned long rescale_at) +{ + if (++stats->rescale == rescale_at) { + stats->rescale = 0; + scale_stat(&stats->cache_hits); + scale_stat(&stats->cache_misses); + scale_stat(&stats->cache_bypass_hits); + scale_stat(&stats->cache_bypass_misses); + scale_stat(&stats->cache_readaheads); + scale_stat(&stats->cache_miss_collisions); + scale_stat(&stats->sectors_bypassed); + } +} + +static void scale_accounting(unsigned long data) +{ + struct cache_accounting *acc = (struct cache_accounting *) data; + +#define move_stat(name) do { \ + unsigned t = atomic_xchg(&acc->collector.name, 0); \ + t <<= 16; \ + acc->five_minute.name += t; \ + acc->hour.name += t; \ + acc->day.name += t; \ + acc->total.name += t; \ +} while (0) + + move_stat(cache_hits); + move_stat(cache_misses); + move_stat(cache_bypass_hits); + move_stat(cache_bypass_misses); + move_stat(cache_readaheads); + move_stat(cache_miss_collisions); + move_stat(sectors_bypassed); + + scale_stats(&acc->total, 0); + scale_stats(&acc->day, DAY_RESCALE); + scale_stats(&acc->hour, HOUR_RESCALE); + scale_stats(&acc->five_minute, FIVE_MINUTE_RESCALE); + + acc->timer.expires += accounting_delay; + + if (!atomic_read(&acc->closing)) + add_timer(&acc->timer); + else + closure_return(&acc->cl); +} + +static void mark_cache_stats(struct cache_stat_collector *stats, + bool hit, bool bypass) +{ + if (!bypass) + if (hit) + atomic_inc(&stats->cache_hits); + else + atomic_inc(&stats->cache_misses); + else + if (hit) + atomic_inc(&stats->cache_bypass_hits); + else + atomic_inc(&stats->cache_bypass_misses); +} + +void bch_mark_cache_accounting(struct search *s, bool hit, bool bypass) +{ + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + mark_cache_stats(&dc->accounting.collector, hit, bypass); + mark_cache_stats(&s->op.c->accounting.collector, hit, bypass); +#ifdef CONFIG_CGROUP_BCACHE + mark_cache_stats(&(bch_bio_to_cgroup(s->orig_bio)->stats), hit, bypass); +#endif +} + +void bch_mark_cache_readahead(struct search *s) +{ + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + atomic_inc(&dc->accounting.collector.cache_readaheads); + atomic_inc(&s->op.c->accounting.collector.cache_readaheads); +} + +void bch_mark_cache_miss_collision(struct search *s) +{ + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + atomic_inc(&dc->accounting.collector.cache_miss_collisions); + atomic_inc(&s->op.c->accounting.collector.cache_miss_collisions); +} + +void bch_mark_sectors_bypassed(struct search *s, int sectors) +{ + struct cached_dev *dc = container_of(s->d, struct cached_dev, disk); + atomic_add(sectors, &dc->accounting.collector.sectors_bypassed); + atomic_add(sectors, &s->op.c->accounting.collector.sectors_bypassed); +} diff --git a/drivers/md/bcache/stats.h b/drivers/md/bcache/stats.h new file mode 100644 index 000000000000..c7c7a8fd29fe --- /dev/null +++ b/drivers/md/bcache/stats.h @@ -0,0 +1,58 @@ +#ifndef _BCACHE_STATS_H_ +#define _BCACHE_STATS_H_ + +struct cache_stat_collector { + atomic_t cache_hits; + atomic_t cache_misses; + atomic_t cache_bypass_hits; + atomic_t cache_bypass_misses; + atomic_t cache_readaheads; + atomic_t cache_miss_collisions; + atomic_t sectors_bypassed; +}; + +struct cache_stats { + struct kobject kobj; + + unsigned long cache_hits; + unsigned long cache_misses; + unsigned long cache_bypass_hits; + unsigned long cache_bypass_misses; + unsigned long cache_readaheads; + unsigned long cache_miss_collisions; + unsigned long sectors_bypassed; + + unsigned rescale; +}; + +struct cache_accounting { + struct closure cl; + struct timer_list timer; + atomic_t closing; + + struct cache_stat_collector collector; + + struct cache_stats total; + struct cache_stats five_minute; + struct cache_stats hour; + struct cache_stats day; +}; + +struct search; + +void bch_cache_accounting_init(struct cache_accounting *acc, + struct closure *parent); + +int bch_cache_accounting_add_kobjs(struct cache_accounting *acc, + struct kobject *parent); + +void bch_cache_accounting_clear(struct cache_accounting *acc); + +void bch_cache_accounting_destroy(struct cache_accounting *acc); + +void bch_mark_cache_accounting(struct search *s, bool hit, bool bypass); +void bch_mark_cache_readahead(struct search *s); +void bch_mark_cache_miss_collision(struct search *s); +void bch_mark_sectors_bypassed(struct search *s, int sectors); + +#endif /* _BCACHE_STATS_H_ */ diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c new file mode 100644 index 000000000000..c8046bc4aa57 --- /dev/null +++ b/drivers/md/bcache/super.c @@ -0,0 +1,1987 @@ +/* + * bcache setup/teardown code, and some metadata io - read a superblock and + * figure out what to do with it. + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" +#include "request.h" + +#include +#include +#include +#include +#include +#include +#include + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Kent Overstreet "); + +static const char bcache_magic[] = { + 0xc6, 0x85, 0x73, 0xf6, 0x4e, 0x1a, 0x45, 0xca, + 0x82, 0x65, 0xf5, 0x7f, 0x48, 0xba, 0x6d, 0x81 +}; + +static const char invalid_uuid[] = { + 0xa0, 0x3e, 0xf8, 0xed, 0x3e, 0xe1, 0xb8, 0x78, + 0xc8, 0x50, 0xfc, 0x5e, 0xcb, 0x16, 0xcd, 0x99 +}; + +/* Default is -1; we skip past it for struct cached_dev's cache mode */ +const char * const bch_cache_modes[] = { + "default", + "writethrough", + "writeback", + "writearound", + "none", + NULL +}; + +struct uuid_entry_v0 { + uint8_t uuid[16]; + uint8_t label[32]; + uint32_t first_reg; + uint32_t last_reg; + uint32_t invalidated; + uint32_t pad; +}; + +static struct kobject *bcache_kobj; +struct mutex bch_register_lock; +LIST_HEAD(bch_cache_sets); +static LIST_HEAD(uncached_devices); + +static int bcache_major, bcache_minor; +static wait_queue_head_t unregister_wait; +struct workqueue_struct *bcache_wq; + +#define BTREE_MAX_PAGES (256 * 1024 / PAGE_SIZE) + +static void bio_split_pool_free(struct bio_split_pool *p) +{ + if (p->bio_split_hook) + mempool_destroy(p->bio_split_hook); + + if (p->bio_split) + bioset_free(p->bio_split); +} + +static int bio_split_pool_init(struct bio_split_pool *p) +{ + p->bio_split = bioset_create(4, 0); + if (!p->bio_split) + return -ENOMEM; + + p->bio_split_hook = mempool_create_kmalloc_pool(4, + sizeof(struct bio_split_hook)); + if (!p->bio_split_hook) + return -ENOMEM; + + return 0; +} + +/* Superblock */ + +static const char *read_super(struct cache_sb *sb, struct block_device *bdev, + struct page **res) +{ + const char *err; + struct cache_sb *s; + struct buffer_head *bh = __bread(bdev, 1, SB_SIZE); + unsigned i; + + if (!bh) + return "IO error"; + + s = (struct cache_sb *) bh->b_data; + + sb->offset = le64_to_cpu(s->offset); + sb->version = le64_to_cpu(s->version); + + memcpy(sb->magic, s->magic, 16); + memcpy(sb->uuid, s->uuid, 16); + memcpy(sb->set_uuid, s->set_uuid, 16); + memcpy(sb->label, s->label, SB_LABEL_SIZE); + + sb->flags = le64_to_cpu(s->flags); + sb->seq = le64_to_cpu(s->seq); + sb->last_mount = le32_to_cpu(s->last_mount); + sb->first_bucket = le16_to_cpu(s->first_bucket); + sb->keys = le16_to_cpu(s->keys); + + for (i = 0; i < SB_JOURNAL_BUCKETS; i++) + sb->d[i] = le64_to_cpu(s->d[i]); + + pr_debug("read sb version %llu, flags %llu, seq %llu, journal size %u", + sb->version, sb->flags, sb->seq, sb->keys); + + err = "Not a bcache superblock"; + if (sb->offset != SB_SECTOR) + goto err; + + if (memcmp(sb->magic, bcache_magic, 16)) + goto err; + + err = "Too many journal buckets"; + if (sb->keys > SB_JOURNAL_BUCKETS) + goto err; + + err = "Bad checksum"; + if (s->csum != csum_set(s)) + goto err; + + err = "Bad UUID"; + if (bch_is_zero(sb->uuid, 16)) + goto err; + + sb->block_size = le16_to_cpu(s->block_size); + + err = "Superblock block size smaller than device block size"; + if (sb->block_size << 9 < bdev_logical_block_size(bdev)) + goto err; + + switch (sb->version) { + case BCACHE_SB_VERSION_BDEV: + sb->data_offset = BDEV_DATA_START_DEFAULT; + break; + case BCACHE_SB_VERSION_BDEV_WITH_OFFSET: + sb->data_offset = le64_to_cpu(s->data_offset); + + err = "Bad data offset"; + if (sb->data_offset < BDEV_DATA_START_DEFAULT) + goto err; + + break; + case BCACHE_SB_VERSION_CDEV: + case BCACHE_SB_VERSION_CDEV_WITH_UUID: + sb->nbuckets = le64_to_cpu(s->nbuckets); + sb->block_size = le16_to_cpu(s->block_size); + sb->bucket_size = le16_to_cpu(s->bucket_size); + + sb->nr_in_set = le16_to_cpu(s->nr_in_set); + sb->nr_this_dev = le16_to_cpu(s->nr_this_dev); + + err = "Too many buckets"; + if (sb->nbuckets > LONG_MAX) + goto err; + + err = "Not enough buckets"; + if (sb->nbuckets < 1 << 7) + goto err; + + err = "Bad block/bucket size"; + if (!is_power_of_2(sb->block_size) || + sb->block_size > PAGE_SECTORS || + !is_power_of_2(sb->bucket_size) || + sb->bucket_size < PAGE_SECTORS) + goto err; + + err = "Invalid superblock: device too small"; + if (get_capacity(bdev->bd_disk) < sb->bucket_size * sb->nbuckets) + goto err; + + err = "Bad UUID"; + if (bch_is_zero(sb->set_uuid, 16)) + goto err; + + err = "Bad cache device number in set"; + if (!sb->nr_in_set || + sb->nr_in_set <= sb->nr_this_dev || + sb->nr_in_set > MAX_CACHES_PER_SET) + goto err; + + err = "Journal buckets not sequential"; + for (i = 0; i < sb->keys; i++) + if (sb->d[i] != sb->first_bucket + i) + goto err; + + err = "Too many journal buckets"; + if (sb->first_bucket + sb->keys > sb->nbuckets) + goto err; + + err = "Invalid superblock: first bucket comes before end of super"; + if (sb->first_bucket * sb->bucket_size < 16) + goto err; + + break; + default: + err = "Unsupported superblock version"; + goto err; + } + + sb->last_mount = get_seconds(); + err = NULL; + + get_page(bh->b_page); + *res = bh->b_page; +err: + put_bh(bh); + return err; +} + +static void write_bdev_super_endio(struct bio *bio, int error) +{ + struct cached_dev *dc = bio->bi_private; + /* XXX: error checking */ + + closure_put(&dc->sb_write.cl); +} + +static void __write_super(struct cache_sb *sb, struct bio *bio) +{ + struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page); + unsigned i; + + bio->bi_sector = SB_SECTOR; + bio->bi_rw = REQ_SYNC|REQ_META; + bio->bi_size = SB_SIZE; + bch_bio_map(bio, NULL); + + out->offset = cpu_to_le64(sb->offset); + out->version = cpu_to_le64(sb->version); + + memcpy(out->uuid, sb->uuid, 16); + memcpy(out->set_uuid, sb->set_uuid, 16); + memcpy(out->label, sb->label, SB_LABEL_SIZE); + + out->flags = cpu_to_le64(sb->flags); + out->seq = cpu_to_le64(sb->seq); + + out->last_mount = cpu_to_le32(sb->last_mount); + out->first_bucket = cpu_to_le16(sb->first_bucket); + out->keys = cpu_to_le16(sb->keys); + + for (i = 0; i < sb->keys; i++) + out->d[i] = cpu_to_le64(sb->d[i]); + + out->csum = csum_set(out); + + pr_debug("ver %llu, flags %llu, seq %llu", + sb->version, sb->flags, sb->seq); + + submit_bio(REQ_WRITE, bio); +} + +void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent) +{ + struct closure *cl = &dc->sb_write.cl; + struct bio *bio = &dc->sb_bio; + + closure_lock(&dc->sb_write, parent); + + bio_reset(bio); + bio->bi_bdev = dc->bdev; + bio->bi_end_io = write_bdev_super_endio; + bio->bi_private = dc; + + closure_get(cl); + __write_super(&dc->sb, bio); + + closure_return(cl); +} + +static void write_super_endio(struct bio *bio, int error) +{ + struct cache *ca = bio->bi_private; + + bch_count_io_errors(ca, error, "writing superblock"); + closure_put(&ca->set->sb_write.cl); +} + +void bcache_write_super(struct cache_set *c) +{ + struct closure *cl = &c->sb_write.cl; + struct cache *ca; + unsigned i; + + closure_lock(&c->sb_write, &c->cl); + + c->sb.seq++; + + for_each_cache(ca, c, i) { + struct bio *bio = &ca->sb_bio; + + ca->sb.version = BCACHE_SB_VERSION_CDEV_WITH_UUID; + ca->sb.seq = c->sb.seq; + ca->sb.last_mount = c->sb.last_mount; + + SET_CACHE_SYNC(&ca->sb, CACHE_SYNC(&c->sb)); + + bio_reset(bio); + bio->bi_bdev = ca->bdev; + bio->bi_end_io = write_super_endio; + bio->bi_private = ca; + + closure_get(cl); + __write_super(&ca->sb, bio); + } + + closure_return(cl); +} + +/* UUID io */ + +static void uuid_endio(struct bio *bio, int error) +{ + struct closure *cl = bio->bi_private; + struct cache_set *c = container_of(cl, struct cache_set, uuid_write.cl); + + cache_set_err_on(error, c, "accessing uuids"); + bch_bbio_free(bio, c); + closure_put(cl); +} + +static void uuid_io(struct cache_set *c, unsigned long rw, + struct bkey *k, struct closure *parent) +{ + struct closure *cl = &c->uuid_write.cl; + struct uuid_entry *u; + unsigned i; + + BUG_ON(!parent); + closure_lock(&c->uuid_write, parent); + + for (i = 0; i < KEY_PTRS(k); i++) { + struct bio *bio = bch_bbio_alloc(c); + + bio->bi_rw = REQ_SYNC|REQ_META|rw; + bio->bi_size = KEY_SIZE(k) << 9; + + bio->bi_end_io = uuid_endio; + bio->bi_private = cl; + bch_bio_map(bio, c->uuids); + + bch_submit_bbio(bio, c, k, i); + + if (!(rw & WRITE)) + break; + } + + pr_debug("%s UUIDs at %s", rw & REQ_WRITE ? "wrote" : "read", + pkey(&c->uuid_bucket)); + + for (u = c->uuids; u < c->uuids + c->nr_uuids; u++) + if (!bch_is_zero(u->uuid, 16)) + pr_debug("Slot %zi: %pU: %s: 1st: %u last: %u inv: %u", + u - c->uuids, u->uuid, u->label, + u->first_reg, u->last_reg, u->invalidated); + + closure_return(cl); +} + +static char *uuid_read(struct cache_set *c, struct jset *j, struct closure *cl) +{ + struct bkey *k = &j->uuid_bucket; + + if (__bch_ptr_invalid(c, 1, k)) + return "bad uuid pointer"; + + bkey_copy(&c->uuid_bucket, k); + uuid_io(c, READ_SYNC, k, cl); + + if (j->version < BCACHE_JSET_VERSION_UUIDv1) { + struct uuid_entry_v0 *u0 = (void *) c->uuids; + struct uuid_entry *u1 = (void *) c->uuids; + int i; + + closure_sync(cl); + + /* + * Since the new uuid entry is bigger than the old, we have to + * convert starting at the highest memory address and work down + * in order to do it in place + */ + + for (i = c->nr_uuids - 1; + i >= 0; + --i) { + memcpy(u1[i].uuid, u0[i].uuid, 16); + memcpy(u1[i].label, u0[i].label, 32); + + u1[i].first_reg = u0[i].first_reg; + u1[i].last_reg = u0[i].last_reg; + u1[i].invalidated = u0[i].invalidated; + + u1[i].flags = 0; + u1[i].sectors = 0; + } + } + + return NULL; +} + +static int __uuid_write(struct cache_set *c) +{ + BKEY_PADDED(key) k; + struct closure cl; + closure_init_stack(&cl); + + lockdep_assert_held(&bch_register_lock); + + if (bch_bucket_alloc_set(c, WATERMARK_METADATA, &k.key, 1, &cl)) + return 1; + + SET_KEY_SIZE(&k.key, c->sb.bucket_size); + uuid_io(c, REQ_WRITE, &k.key, &cl); + closure_sync(&cl); + + bkey_copy(&c->uuid_bucket, &k.key); + __bkey_put(c, &k.key); + return 0; +} + +int bch_uuid_write(struct cache_set *c) +{ + int ret = __uuid_write(c); + + if (!ret) + bch_journal_meta(c, NULL); + + return ret; +} + +static struct uuid_entry *uuid_find(struct cache_set *c, const char *uuid) +{ + struct uuid_entry *u; + + for (u = c->uuids; + u < c->uuids + c->nr_uuids; u++) + if (!memcmp(u->uuid, uuid, 16)) + return u; + + return NULL; +} + +static struct uuid_entry *uuid_find_empty(struct cache_set *c) +{ + static const char zero_uuid[16] = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"; + return uuid_find(c, zero_uuid); +} + +/* + * Bucket priorities/gens: + * + * For each bucket, we store on disk its + * 8 bit gen + * 16 bit priority + * + * See alloc.c for an explanation of the gen. The priority is used to implement + * lru (and in the future other) cache replacement policies; for most purposes + * it's just an opaque integer. + * + * The gens and the priorities don't have a whole lot to do with each other, and + * it's actually the gens that must be written out at specific times - it's no + * big deal if the priorities don't get written, if we lose them we just reuse + * buckets in suboptimal order. + * + * On disk they're stored in a packed array, and in as many buckets are required + * to fit them all. The buckets we use to store them form a list; the journal + * header points to the first bucket, the first bucket points to the second + * bucket, et cetera. + * + * This code is used by the allocation code; periodically (whenever it runs out + * of buckets to allocate from) the allocation code will invalidate some + * buckets, but it can't use those buckets until their new gens are safely on + * disk. + */ + +static void prio_endio(struct bio *bio, int error) +{ + struct cache *ca = bio->bi_private; + + cache_set_err_on(error, ca->set, "accessing priorities"); + bch_bbio_free(bio, ca->set); + closure_put(&ca->prio); +} + +static void prio_io(struct cache *ca, uint64_t bucket, unsigned long rw) +{ + struct closure *cl = &ca->prio; + struct bio *bio = bch_bbio_alloc(ca->set); + + closure_init_stack(cl); + + bio->bi_sector = bucket * ca->sb.bucket_size; + bio->bi_bdev = ca->bdev; + bio->bi_rw = REQ_SYNC|REQ_META|rw; + bio->bi_size = bucket_bytes(ca); + + bio->bi_end_io = prio_endio; + bio->bi_private = ca; + bch_bio_map(bio, ca->disk_buckets); + + closure_bio_submit(bio, &ca->prio, ca); + closure_sync(cl); +} + +#define buckets_free(c) "free %zu, free_inc %zu, unused %zu", \ + fifo_used(&c->free), fifo_used(&c->free_inc), fifo_used(&c->unused) + +void bch_prio_write(struct cache *ca) +{ + int i; + struct bucket *b; + struct closure cl; + + closure_init_stack(&cl); + + lockdep_assert_held(&ca->set->bucket_lock); + + for (b = ca->buckets; + b < ca->buckets + ca->sb.nbuckets; b++) + b->disk_gen = b->gen; + + ca->disk_buckets->seq++; + + atomic_long_add(ca->sb.bucket_size * prio_buckets(ca), + &ca->meta_sectors_written); + + pr_debug("free %zu, free_inc %zu, unused %zu", fifo_used(&ca->free), + fifo_used(&ca->free_inc), fifo_used(&ca->unused)); + blktrace_msg(ca, "Starting priorities: " buckets_free(ca)); + + for (i = prio_buckets(ca) - 1; i >= 0; --i) { + long bucket; + struct prio_set *p = ca->disk_buckets; + struct bucket_disk *d = p->data; + struct bucket_disk *end = d + prios_per_bucket(ca); + + for (b = ca->buckets + i * prios_per_bucket(ca); + b < ca->buckets + ca->sb.nbuckets && d < end; + b++, d++) { + d->prio = cpu_to_le16(b->prio); + d->gen = b->gen; + } + + p->next_bucket = ca->prio_buckets[i + 1]; + p->magic = pset_magic(ca); + p->csum = bch_crc64(&p->magic, bucket_bytes(ca) - 8); + + bucket = bch_bucket_alloc(ca, WATERMARK_PRIO, &cl); + BUG_ON(bucket == -1); + + mutex_unlock(&ca->set->bucket_lock); + prio_io(ca, bucket, REQ_WRITE); + mutex_lock(&ca->set->bucket_lock); + + ca->prio_buckets[i] = bucket; + atomic_dec_bug(&ca->buckets[bucket].pin); + } + + mutex_unlock(&ca->set->bucket_lock); + + bch_journal_meta(ca->set, &cl); + closure_sync(&cl); + + mutex_lock(&ca->set->bucket_lock); + + ca->need_save_prio = 0; + + /* + * Don't want the old priorities to get garbage collected until after we + * finish writing the new ones, and they're journalled + */ + for (i = 0; i < prio_buckets(ca); i++) + ca->prio_last_buckets[i] = ca->prio_buckets[i]; +} + +static void prio_read(struct cache *ca, uint64_t bucket) +{ + struct prio_set *p = ca->disk_buckets; + struct bucket_disk *d = p->data + prios_per_bucket(ca), *end = d; + struct bucket *b; + unsigned bucket_nr = 0; + + for (b = ca->buckets; + b < ca->buckets + ca->sb.nbuckets; + b++, d++) { + if (d == end) { + ca->prio_buckets[bucket_nr] = bucket; + ca->prio_last_buckets[bucket_nr] = bucket; + bucket_nr++; + + prio_io(ca, bucket, READ_SYNC); + + if (p->csum != bch_crc64(&p->magic, bucket_bytes(ca) - 8)) + pr_warn("bad csum reading priorities"); + + if (p->magic != pset_magic(ca)) + pr_warn("bad magic reading priorities"); + + bucket = p->next_bucket; + d = p->data; + } + + b->prio = le16_to_cpu(d->prio); + b->gen = b->disk_gen = b->last_gc = b->gc_gen = d->gen; + } +} + +/* Bcache device */ + +static int open_dev(struct block_device *b, fmode_t mode) +{ + struct bcache_device *d = b->bd_disk->private_data; + if (atomic_read(&d->closing)) + return -ENXIO; + + closure_get(&d->cl); + return 0; +} + +static int release_dev(struct gendisk *b, fmode_t mode) +{ + struct bcache_device *d = b->private_data; + closure_put(&d->cl); + return 0; +} + +static int ioctl_dev(struct block_device *b, fmode_t mode, + unsigned int cmd, unsigned long arg) +{ + struct bcache_device *d = b->bd_disk->private_data; + return d->ioctl(d, mode, cmd, arg); +} + +static const struct block_device_operations bcache_ops = { + .open = open_dev, + .release = release_dev, + .ioctl = ioctl_dev, + .owner = THIS_MODULE, +}; + +void bcache_device_stop(struct bcache_device *d) +{ + if (!atomic_xchg(&d->closing, 1)) + closure_queue(&d->cl); +} + +static void bcache_device_unlink(struct bcache_device *d) +{ + unsigned i; + struct cache *ca; + + sysfs_remove_link(&d->c->kobj, d->name); + sysfs_remove_link(&d->kobj, "cache"); + + for_each_cache(ca, d->c, i) + bd_unlink_disk_holder(ca->bdev, d->disk); +} + +static void bcache_device_link(struct bcache_device *d, struct cache_set *c, + const char *name) +{ + unsigned i; + struct cache *ca; + + for_each_cache(ca, d->c, i) + bd_link_disk_holder(ca->bdev, d->disk); + + snprintf(d->name, BCACHEDEVNAME_SIZE, + "%s%u", name, d->id); + + WARN(sysfs_create_link(&d->kobj, &c->kobj, "cache") || + sysfs_create_link(&c->kobj, &d->kobj, d->name), + "Couldn't create device <-> cache set symlinks"); +} + +static void bcache_device_detach(struct bcache_device *d) +{ + lockdep_assert_held(&bch_register_lock); + + if (atomic_read(&d->detaching)) { + struct uuid_entry *u = d->c->uuids + d->id; + + SET_UUID_FLASH_ONLY(u, 0); + memcpy(u->uuid, invalid_uuid, 16); + u->invalidated = cpu_to_le32(get_seconds()); + bch_uuid_write(d->c); + + atomic_set(&d->detaching, 0); + } + + bcache_device_unlink(d); + + d->c->devices[d->id] = NULL; + closure_put(&d->c->caching); + d->c = NULL; +} + +static void bcache_device_attach(struct bcache_device *d, struct cache_set *c, + unsigned id) +{ + BUG_ON(test_bit(CACHE_SET_STOPPING, &c->flags)); + + d->id = id; + d->c = c; + c->devices[id] = d; + + closure_get(&c->caching); +} + +static void bcache_device_free(struct bcache_device *d) +{ + lockdep_assert_held(&bch_register_lock); + + pr_info("%s stopped", d->disk->disk_name); + + if (d->c) + bcache_device_detach(d); + + if (d->disk) + del_gendisk(d->disk); + if (d->disk && d->disk->queue) + blk_cleanup_queue(d->disk->queue); + if (d->disk) + put_disk(d->disk); + + bio_split_pool_free(&d->bio_split_hook); + if (d->unaligned_bvec) + mempool_destroy(d->unaligned_bvec); + if (d->bio_split) + bioset_free(d->bio_split); + + closure_debug_destroy(&d->cl); +} + +static int bcache_device_init(struct bcache_device *d, unsigned block_size) +{ + struct request_queue *q; + + if (!(d->bio_split = bioset_create(4, offsetof(struct bbio, bio))) || + !(d->unaligned_bvec = mempool_create_kmalloc_pool(1, + sizeof(struct bio_vec) * BIO_MAX_PAGES)) || + bio_split_pool_init(&d->bio_split_hook)) + + return -ENOMEM; + + d->disk = alloc_disk(1); + if (!d->disk) + return -ENOMEM; + + snprintf(d->disk->disk_name, DISK_NAME_LEN, "bcache%i", bcache_minor); + + d->disk->major = bcache_major; + d->disk->first_minor = bcache_minor++; + d->disk->fops = &bcache_ops; + d->disk->private_data = d; + + q = blk_alloc_queue(GFP_KERNEL); + if (!q) + return -ENOMEM; + + blk_queue_make_request(q, NULL); + d->disk->queue = q; + q->queuedata = d; + q->backing_dev_info.congested_data = d; + q->limits.max_hw_sectors = UINT_MAX; + q->limits.max_sectors = UINT_MAX; + q->limits.max_segment_size = UINT_MAX; + q->limits.max_segments = BIO_MAX_PAGES; + q->limits.max_discard_sectors = UINT_MAX; + q->limits.io_min = block_size; + q->limits.logical_block_size = block_size; + q->limits.physical_block_size = block_size; + set_bit(QUEUE_FLAG_NONROT, &d->disk->queue->queue_flags); + set_bit(QUEUE_FLAG_DISCARD, &d->disk->queue->queue_flags); + + return 0; +} + +/* Cached device */ + +static void calc_cached_dev_sectors(struct cache_set *c) +{ + uint64_t sectors = 0; + struct cached_dev *dc; + + list_for_each_entry(dc, &c->cached_devs, list) + sectors += bdev_sectors(dc->bdev); + + c->cached_dev_sectors = sectors; +} + +void bch_cached_dev_run(struct cached_dev *dc) +{ + struct bcache_device *d = &dc->disk; + + if (atomic_xchg(&dc->running, 1)) + return; + + if (!d->c && + BDEV_STATE(&dc->sb) != BDEV_STATE_NONE) { + struct closure cl; + closure_init_stack(&cl); + + SET_BDEV_STATE(&dc->sb, BDEV_STATE_STALE); + bch_write_bdev_super(dc, &cl); + closure_sync(&cl); + } + + add_disk(d->disk); + bd_link_disk_holder(dc->bdev, dc->disk.disk); +#if 0 + char *env[] = { "SYMLINK=label" , NULL }; + kobject_uevent_env(&disk_to_dev(d->disk)->kobj, KOBJ_CHANGE, env); +#endif + if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") || + sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache")) + pr_debug("error creating sysfs link"); +} + +static void cached_dev_detach_finish(struct work_struct *w) +{ + struct cached_dev *dc = container_of(w, struct cached_dev, detach); + char buf[BDEVNAME_SIZE]; + struct closure cl; + closure_init_stack(&cl); + + BUG_ON(!atomic_read(&dc->disk.detaching)); + BUG_ON(atomic_read(&dc->count)); + + mutex_lock(&bch_register_lock); + + memset(&dc->sb.set_uuid, 0, 16); + SET_BDEV_STATE(&dc->sb, BDEV_STATE_NONE); + + bch_write_bdev_super(dc, &cl); + closure_sync(&cl); + + bcache_device_detach(&dc->disk); + list_move(&dc->list, &uncached_devices); + + mutex_unlock(&bch_register_lock); + + pr_info("Caching disabled for %s", bdevname(dc->bdev, buf)); + + /* Drop ref we took in cached_dev_detach() */ + closure_put(&dc->disk.cl); +} + +void bch_cached_dev_detach(struct cached_dev *dc) +{ + lockdep_assert_held(&bch_register_lock); + + if (atomic_read(&dc->disk.closing)) + return; + + if (atomic_xchg(&dc->disk.detaching, 1)) + return; + + /* + * Block the device from being closed and freed until we're finished + * detaching + */ + closure_get(&dc->disk.cl); + + bch_writeback_queue(dc); + cached_dev_put(dc); +} + +int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c) +{ + uint32_t rtime = cpu_to_le32(get_seconds()); + struct uuid_entry *u; + char buf[BDEVNAME_SIZE]; + + bdevname(dc->bdev, buf); + + if (memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16)) + return -ENOENT; + + if (dc->disk.c) { + pr_err("Can't attach %s: already attached", buf); + return -EINVAL; + } + + if (test_bit(CACHE_SET_STOPPING, &c->flags)) { + pr_err("Can't attach %s: shutting down", buf); + return -EINVAL; + } + + if (dc->sb.block_size < c->sb.block_size) { + /* Will die */ + pr_err("Couldn't attach %s: block size less than set's block size", + buf); + return -EINVAL; + } + + u = uuid_find(c, dc->sb.uuid); + + if (u && + (BDEV_STATE(&dc->sb) == BDEV_STATE_STALE || + BDEV_STATE(&dc->sb) == BDEV_STATE_NONE)) { + memcpy(u->uuid, invalid_uuid, 16); + u->invalidated = cpu_to_le32(get_seconds()); + u = NULL; + } + + if (!u) { + if (BDEV_STATE(&dc->sb) == BDEV_STATE_DIRTY) { + pr_err("Couldn't find uuid for %s in set", buf); + return -ENOENT; + } + + u = uuid_find_empty(c); + if (!u) { + pr_err("Not caching %s, no room for UUID", buf); + return -EINVAL; + } + } + + /* Deadlocks since we're called via sysfs... + sysfs_remove_file(&dc->kobj, &sysfs_attach); + */ + + if (bch_is_zero(u->uuid, 16)) { + struct closure cl; + closure_init_stack(&cl); + + memcpy(u->uuid, dc->sb.uuid, 16); + memcpy(u->label, dc->sb.label, SB_LABEL_SIZE); + u->first_reg = u->last_reg = rtime; + bch_uuid_write(c); + + memcpy(dc->sb.set_uuid, c->sb.set_uuid, 16); + SET_BDEV_STATE(&dc->sb, BDEV_STATE_CLEAN); + + bch_write_bdev_super(dc, &cl); + closure_sync(&cl); + } else { + u->last_reg = rtime; + bch_uuid_write(c); + } + + bcache_device_attach(&dc->disk, c, u - c->uuids); + list_move(&dc->list, &c->cached_devs); + calc_cached_dev_sectors(c); + + smp_wmb(); + /* + * dc->c must be set before dc->count != 0 - paired with the mb in + * cached_dev_get() + */ + atomic_set(&dc->count, 1); + + if (BDEV_STATE(&dc->sb) == BDEV_STATE_DIRTY) { + atomic_set(&dc->has_dirty, 1); + atomic_inc(&dc->count); + bch_writeback_queue(dc); + } + + bch_cached_dev_run(dc); + bcache_device_link(&dc->disk, c, "bdev"); + + pr_info("Caching %s as %s on set %pU", + bdevname(dc->bdev, buf), dc->disk.disk->disk_name, + dc->disk.c->sb.set_uuid); + return 0; +} + +void bch_cached_dev_release(struct kobject *kobj) +{ + struct cached_dev *dc = container_of(kobj, struct cached_dev, + disk.kobj); + kfree(dc); + module_put(THIS_MODULE); +} + +static void cached_dev_free(struct closure *cl) +{ + struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl); + + cancel_delayed_work_sync(&dc->writeback_rate_update); + + mutex_lock(&bch_register_lock); + + bd_unlink_disk_holder(dc->bdev, dc->disk.disk); + bcache_device_free(&dc->disk); + list_del(&dc->list); + + mutex_unlock(&bch_register_lock); + + if (!IS_ERR_OR_NULL(dc->bdev)) { + blk_sync_queue(bdev_get_queue(dc->bdev)); + blkdev_put(dc->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL); + } + + wake_up(&unregister_wait); + + kobject_put(&dc->disk.kobj); +} + +static void cached_dev_flush(struct closure *cl) +{ + struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl); + struct bcache_device *d = &dc->disk; + + bch_cache_accounting_destroy(&dc->accounting); + kobject_del(&d->kobj); + + continue_at(cl, cached_dev_free, system_wq); +} + +static int cached_dev_init(struct cached_dev *dc, unsigned block_size) +{ + int err; + struct io *io; + + closure_init(&dc->disk.cl, NULL); + set_closure_fn(&dc->disk.cl, cached_dev_flush, system_wq); + + __module_get(THIS_MODULE); + INIT_LIST_HEAD(&dc->list); + kobject_init(&dc->disk.kobj, &bch_cached_dev_ktype); + + bch_cache_accounting_init(&dc->accounting, &dc->disk.cl); + + err = bcache_device_init(&dc->disk, block_size); + if (err) + goto err; + + spin_lock_init(&dc->io_lock); + closure_init_unlocked(&dc->sb_write); + INIT_WORK(&dc->detach, cached_dev_detach_finish); + + dc->sequential_merge = true; + dc->sequential_cutoff = 4 << 20; + + INIT_LIST_HEAD(&dc->io_lru); + dc->sb_bio.bi_max_vecs = 1; + dc->sb_bio.bi_io_vec = dc->sb_bio.bi_inline_vecs; + + for (io = dc->io; io < dc->io + RECENT_IO; io++) { + list_add(&io->lru, &dc->io_lru); + hlist_add_head(&io->hash, dc->io_hash + RECENT_IO); + } + + bch_writeback_init_cached_dev(dc); + return 0; +err: + bcache_device_stop(&dc->disk); + return err; +} + +/* Cached device - bcache superblock */ + +static const char *register_bdev(struct cache_sb *sb, struct page *sb_page, + struct block_device *bdev, + struct cached_dev *dc) +{ + char name[BDEVNAME_SIZE]; + const char *err = "cannot allocate memory"; + struct gendisk *g; + struct cache_set *c; + + if (!dc || cached_dev_init(dc, sb->block_size << 9) != 0) + return err; + + memcpy(&dc->sb, sb, sizeof(struct cache_sb)); + dc->sb_bio.bi_io_vec[0].bv_page = sb_page; + dc->bdev = bdev; + dc->bdev->bd_holder = dc; + + g = dc->disk.disk; + + set_capacity(g, dc->bdev->bd_part->nr_sects - dc->sb.data_offset); + + g->queue->backing_dev_info.ra_pages = + max(g->queue->backing_dev_info.ra_pages, + bdev->bd_queue->backing_dev_info.ra_pages); + + bch_cached_dev_request_init(dc); + + err = "error creating kobject"; + if (kobject_add(&dc->disk.kobj, &part_to_dev(bdev->bd_part)->kobj, + "bcache")) + goto err; + if (bch_cache_accounting_add_kobjs(&dc->accounting, &dc->disk.kobj)) + goto err; + + list_add(&dc->list, &uncached_devices); + list_for_each_entry(c, &bch_cache_sets, list) + bch_cached_dev_attach(dc, c); + + if (BDEV_STATE(&dc->sb) == BDEV_STATE_NONE || + BDEV_STATE(&dc->sb) == BDEV_STATE_STALE) + bch_cached_dev_run(dc); + + return NULL; +err: + kobject_put(&dc->disk.kobj); + pr_notice("error opening %s: %s", bdevname(bdev, name), err); + /* + * Return NULL instead of an error because kobject_put() cleans + * everything up + */ + return NULL; +} + +/* Flash only volumes */ + +void bch_flash_dev_release(struct kobject *kobj) +{ + struct bcache_device *d = container_of(kobj, struct bcache_device, + kobj); + kfree(d); +} + +static void flash_dev_free(struct closure *cl) +{ + struct bcache_device *d = container_of(cl, struct bcache_device, cl); + bcache_device_free(d); + kobject_put(&d->kobj); +} + +static void flash_dev_flush(struct closure *cl) +{ + struct bcache_device *d = container_of(cl, struct bcache_device, cl); + + bcache_device_unlink(d); + kobject_del(&d->kobj); + continue_at(cl, flash_dev_free, system_wq); +} + +static int flash_dev_run(struct cache_set *c, struct uuid_entry *u) +{ + struct bcache_device *d = kzalloc(sizeof(struct bcache_device), + GFP_KERNEL); + if (!d) + return -ENOMEM; + + closure_init(&d->cl, NULL); + set_closure_fn(&d->cl, flash_dev_flush, system_wq); + + kobject_init(&d->kobj, &bch_flash_dev_ktype); + + if (bcache_device_init(d, block_bytes(c))) + goto err; + + bcache_device_attach(d, c, u - c->uuids); + set_capacity(d->disk, u->sectors); + bch_flash_dev_request_init(d); + add_disk(d->disk); + + if (kobject_add(&d->kobj, &disk_to_dev(d->disk)->kobj, "bcache")) + goto err; + + bcache_device_link(d, c, "volume"); + + return 0; +err: + kobject_put(&d->kobj); + return -ENOMEM; +} + +static int flash_devs_run(struct cache_set *c) +{ + int ret = 0; + struct uuid_entry *u; + + for (u = c->uuids; + u < c->uuids + c->nr_uuids && !ret; + u++) + if (UUID_FLASH_ONLY(u)) + ret = flash_dev_run(c, u); + + return ret; +} + +int bch_flash_dev_create(struct cache_set *c, uint64_t size) +{ + struct uuid_entry *u; + + if (test_bit(CACHE_SET_STOPPING, &c->flags)) + return -EINTR; + + u = uuid_find_empty(c); + if (!u) { + pr_err("Can't create volume, no room for UUID"); + return -EINVAL; + } + + get_random_bytes(u->uuid, 16); + memset(u->label, 0, 32); + u->first_reg = u->last_reg = cpu_to_le32(get_seconds()); + + SET_UUID_FLASH_ONLY(u, 1); + u->sectors = size >> 9; + + bch_uuid_write(c); + + return flash_dev_run(c, u); +} + +/* Cache set */ + +__printf(2, 3) +bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...) +{ + va_list args; + + if (test_bit(CACHE_SET_STOPPING, &c->flags)) + return false; + + /* XXX: we can be called from atomic context + acquire_console_sem(); + */ + + printk(KERN_ERR "bcache: error on %pU: ", c->sb.set_uuid); + + va_start(args, fmt); + vprintk(fmt, args); + va_end(args); + + printk(", disabling caching\n"); + + bch_cache_set_unregister(c); + return true; +} + +void bch_cache_set_release(struct kobject *kobj) +{ + struct cache_set *c = container_of(kobj, struct cache_set, kobj); + kfree(c); + module_put(THIS_MODULE); +} + +static void cache_set_free(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, cl); + struct cache *ca; + unsigned i; + + if (!IS_ERR_OR_NULL(c->debug)) + debugfs_remove(c->debug); + + bch_open_buckets_free(c); + bch_btree_cache_free(c); + bch_journal_free(c); + + for_each_cache(ca, c, i) + if (ca) + kobject_put(&ca->kobj); + + free_pages((unsigned long) c->uuids, ilog2(bucket_pages(c))); + free_pages((unsigned long) c->sort, ilog2(bucket_pages(c))); + + kfree(c->fill_iter); + if (c->bio_split) + bioset_free(c->bio_split); + if (c->bio_meta) + mempool_destroy(c->bio_meta); + if (c->search) + mempool_destroy(c->search); + kfree(c->devices); + + mutex_lock(&bch_register_lock); + list_del(&c->list); + mutex_unlock(&bch_register_lock); + + pr_info("Cache set %pU unregistered", c->sb.set_uuid); + wake_up(&unregister_wait); + + closure_debug_destroy(&c->cl); + kobject_put(&c->kobj); +} + +static void cache_set_flush(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, caching); + struct btree *b; + + /* Shut down allocator threads */ + set_bit(CACHE_SET_STOPPING_2, &c->flags); + wake_up(&c->alloc_wait); + + bch_cache_accounting_destroy(&c->accounting); + + kobject_put(&c->internal); + kobject_del(&c->kobj); + + if (!IS_ERR_OR_NULL(c->root)) + list_add(&c->root->list, &c->btree_cache); + + /* Should skip this if we're unregistering because of an error */ + list_for_each_entry(b, &c->btree_cache, list) + if (btree_node_dirty(b)) + bch_btree_write(b, true, NULL); + + closure_return(cl); +} + +static void __cache_set_unregister(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, caching); + struct cached_dev *dc, *t; + size_t i; + + mutex_lock(&bch_register_lock); + + if (test_bit(CACHE_SET_UNREGISTERING, &c->flags)) + list_for_each_entry_safe(dc, t, &c->cached_devs, list) + bch_cached_dev_detach(dc); + + for (i = 0; i < c->nr_uuids; i++) + if (c->devices[i] && UUID_FLASH_ONLY(&c->uuids[i])) + bcache_device_stop(c->devices[i]); + + mutex_unlock(&bch_register_lock); + + continue_at(cl, cache_set_flush, system_wq); +} + +void bch_cache_set_stop(struct cache_set *c) +{ + if (!test_and_set_bit(CACHE_SET_STOPPING, &c->flags)) + closure_queue(&c->caching); +} + +void bch_cache_set_unregister(struct cache_set *c) +{ + set_bit(CACHE_SET_UNREGISTERING, &c->flags); + bch_cache_set_stop(c); +} + +#define alloc_bucket_pages(gfp, c) \ + ((void *) __get_free_pages(__GFP_ZERO|gfp, ilog2(bucket_pages(c)))) + +struct cache_set *bch_cache_set_alloc(struct cache_sb *sb) +{ + int iter_size; + struct cache_set *c = kzalloc(sizeof(struct cache_set), GFP_KERNEL); + if (!c) + return NULL; + + __module_get(THIS_MODULE); + closure_init(&c->cl, NULL); + set_closure_fn(&c->cl, cache_set_free, system_wq); + + closure_init(&c->caching, &c->cl); + set_closure_fn(&c->caching, __cache_set_unregister, system_wq); + + /* Maybe create continue_at_noreturn() and use it here? */ + closure_set_stopped(&c->cl); + closure_put(&c->cl); + + kobject_init(&c->kobj, &bch_cache_set_ktype); + kobject_init(&c->internal, &bch_cache_set_internal_ktype); + + bch_cache_accounting_init(&c->accounting, &c->cl); + + memcpy(c->sb.set_uuid, sb->set_uuid, 16); + c->sb.block_size = sb->block_size; + c->sb.bucket_size = sb->bucket_size; + c->sb.nr_in_set = sb->nr_in_set; + c->sb.last_mount = sb->last_mount; + c->bucket_bits = ilog2(sb->bucket_size); + c->block_bits = ilog2(sb->block_size); + c->nr_uuids = bucket_bytes(c) / sizeof(struct uuid_entry); + + c->btree_pages = c->sb.bucket_size / PAGE_SECTORS; + if (c->btree_pages > BTREE_MAX_PAGES) + c->btree_pages = max_t(int, c->btree_pages / 4, + BTREE_MAX_PAGES); + + init_waitqueue_head(&c->alloc_wait); + mutex_init(&c->bucket_lock); + mutex_init(&c->fill_lock); + mutex_init(&c->sort_lock); + spin_lock_init(&c->sort_time_lock); + closure_init_unlocked(&c->sb_write); + closure_init_unlocked(&c->uuid_write); + spin_lock_init(&c->btree_read_time_lock); + bch_moving_init_cache_set(c); + + INIT_LIST_HEAD(&c->list); + INIT_LIST_HEAD(&c->cached_devs); + INIT_LIST_HEAD(&c->btree_cache); + INIT_LIST_HEAD(&c->btree_cache_freeable); + INIT_LIST_HEAD(&c->btree_cache_freed); + INIT_LIST_HEAD(&c->data_buckets); + + c->search = mempool_create_slab_pool(32, bch_search_cache); + if (!c->search) + goto err; + + iter_size = (sb->bucket_size / sb->block_size + 1) * + sizeof(struct btree_iter_set); + + if (!(c->devices = kzalloc(c->nr_uuids * sizeof(void *), GFP_KERNEL)) || + !(c->bio_meta = mempool_create_kmalloc_pool(2, + sizeof(struct bbio) + sizeof(struct bio_vec) * + bucket_pages(c))) || + !(c->bio_split = bioset_create(4, offsetof(struct bbio, bio))) || + !(c->fill_iter = kmalloc(iter_size, GFP_KERNEL)) || + !(c->sort = alloc_bucket_pages(GFP_KERNEL, c)) || + !(c->uuids = alloc_bucket_pages(GFP_KERNEL, c)) || + bch_journal_alloc(c) || + bch_btree_cache_alloc(c) || + bch_open_buckets_alloc(c)) + goto err; + + c->fill_iter->size = sb->bucket_size / sb->block_size; + + c->congested_read_threshold_us = 2000; + c->congested_write_threshold_us = 20000; + c->error_limit = 8 << IO_ERROR_SHIFT; + + return c; +err: + bch_cache_set_unregister(c); + return NULL; +} + +static void run_cache_set(struct cache_set *c) +{ + const char *err = "cannot allocate memory"; + struct cached_dev *dc, *t; + struct cache *ca; + unsigned i; + + struct btree_op op; + bch_btree_op_init_stack(&op); + op.lock = SHRT_MAX; + + for_each_cache(ca, c, i) + c->nbuckets += ca->sb.nbuckets; + + if (CACHE_SYNC(&c->sb)) { + LIST_HEAD(journal); + struct bkey *k; + struct jset *j; + + err = "cannot allocate memory for journal"; + if (bch_journal_read(c, &journal, &op)) + goto err; + + pr_debug("btree_journal_read() done"); + + err = "no journal entries found"; + if (list_empty(&journal)) + goto err; + + j = &list_entry(journal.prev, struct journal_replay, list)->j; + + err = "IO error reading priorities"; + for_each_cache(ca, c, i) + prio_read(ca, j->prio_bucket[ca->sb.nr_this_dev]); + + /* + * If prio_read() fails it'll call cache_set_error and we'll + * tear everything down right away, but if we perhaps checked + * sooner we could avoid journal replay. + */ + + k = &j->btree_root; + + err = "bad btree root"; + if (__bch_ptr_invalid(c, j->btree_level + 1, k)) + goto err; + + err = "error reading btree root"; + c->root = bch_btree_node_get(c, k, j->btree_level, &op); + if (IS_ERR_OR_NULL(c->root)) + goto err; + + list_del_init(&c->root->list); + rw_unlock(true, c->root); + + err = uuid_read(c, j, &op.cl); + if (err) + goto err; + + err = "error in recovery"; + if (bch_btree_check(c, &op)) + goto err; + + bch_journal_mark(c, &journal); + bch_btree_gc_finish(c); + pr_debug("btree_check() done"); + + /* + * bcache_journal_next() can't happen sooner, or + * btree_gc_finish() will give spurious errors about last_gc > + * gc_gen - this is a hack but oh well. + */ + bch_journal_next(&c->journal); + + for_each_cache(ca, c, i) + closure_call(&ca->alloc, bch_allocator_thread, + system_wq, &c->cl); + + /* + * First place it's safe to allocate: btree_check() and + * btree_gc_finish() have to run before we have buckets to + * allocate, and bch_bucket_alloc_set() might cause a journal + * entry to be written so bcache_journal_next() has to be called + * first. + * + * If the uuids were in the old format we have to rewrite them + * before the next journal entry is written: + */ + if (j->version < BCACHE_JSET_VERSION_UUID) + __uuid_write(c); + + bch_journal_replay(c, &journal, &op); + } else { + pr_notice("invalidating existing data"); + /* Don't want invalidate_buckets() to queue a gc yet */ + closure_lock(&c->gc, NULL); + + for_each_cache(ca, c, i) { + unsigned j; + + ca->sb.keys = clamp_t(int, ca->sb.nbuckets >> 7, + 2, SB_JOURNAL_BUCKETS); + + for (j = 0; j < ca->sb.keys; j++) + ca->sb.d[j] = ca->sb.first_bucket + j; + } + + bch_btree_gc_finish(c); + + for_each_cache(ca, c, i) + closure_call(&ca->alloc, bch_allocator_thread, + ca->alloc_workqueue, &c->cl); + + mutex_lock(&c->bucket_lock); + for_each_cache(ca, c, i) + bch_prio_write(ca); + mutex_unlock(&c->bucket_lock); + + wake_up(&c->alloc_wait); + + err = "cannot allocate new UUID bucket"; + if (__uuid_write(c)) + goto err_unlock_gc; + + err = "cannot allocate new btree root"; + c->root = bch_btree_node_alloc(c, 0, &op.cl); + if (IS_ERR_OR_NULL(c->root)) + goto err_unlock_gc; + + bkey_copy_key(&c->root->key, &MAX_KEY); + bch_btree_write(c->root, true, &op); + + bch_btree_set_root(c->root); + rw_unlock(true, c->root); + + /* + * We don't want to write the first journal entry until + * everything is set up - fortunately journal entries won't be + * written until the SET_CACHE_SYNC() here: + */ + SET_CACHE_SYNC(&c->sb, true); + + bch_journal_next(&c->journal); + bch_journal_meta(c, &op.cl); + + /* Unlock */ + closure_set_stopped(&c->gc.cl); + closure_put(&c->gc.cl); + } + + closure_sync(&op.cl); + c->sb.last_mount = get_seconds(); + bcache_write_super(c); + + list_for_each_entry_safe(dc, t, &uncached_devices, list) + bch_cached_dev_attach(dc, c); + + flash_devs_run(c); + + return; +err_unlock_gc: + closure_set_stopped(&c->gc.cl); + closure_put(&c->gc.cl); +err: + closure_sync(&op.cl); + /* XXX: test this, it's broken */ + bch_cache_set_error(c, err); +} + +static bool can_attach_cache(struct cache *ca, struct cache_set *c) +{ + return ca->sb.block_size == c->sb.block_size && + ca->sb.bucket_size == c->sb.block_size && + ca->sb.nr_in_set == c->sb.nr_in_set; +} + +static const char *register_cache_set(struct cache *ca) +{ + char buf[12]; + const char *err = "cannot allocate memory"; + struct cache_set *c; + + list_for_each_entry(c, &bch_cache_sets, list) + if (!memcmp(c->sb.set_uuid, ca->sb.set_uuid, 16)) { + if (c->cache[ca->sb.nr_this_dev]) + return "duplicate cache set member"; + + if (!can_attach_cache(ca, c)) + return "cache sb does not match set"; + + if (!CACHE_SYNC(&ca->sb)) + SET_CACHE_SYNC(&c->sb, false); + + goto found; + } + + c = bch_cache_set_alloc(&ca->sb); + if (!c) + return err; + + err = "error creating kobject"; + if (kobject_add(&c->kobj, bcache_kobj, "%pU", c->sb.set_uuid) || + kobject_add(&c->internal, &c->kobj, "internal")) + goto err; + + if (bch_cache_accounting_add_kobjs(&c->accounting, &c->kobj)) + goto err; + + bch_debug_init_cache_set(c); + + list_add(&c->list, &bch_cache_sets); +found: + sprintf(buf, "cache%i", ca->sb.nr_this_dev); + if (sysfs_create_link(&ca->kobj, &c->kobj, "set") || + sysfs_create_link(&c->kobj, &ca->kobj, buf)) + goto err; + + if (ca->sb.seq > c->sb.seq) { + c->sb.version = ca->sb.version; + memcpy(c->sb.set_uuid, ca->sb.set_uuid, 16); + c->sb.flags = ca->sb.flags; + c->sb.seq = ca->sb.seq; + pr_debug("set version = %llu", c->sb.version); + } + + ca->set = c; + ca->set->cache[ca->sb.nr_this_dev] = ca; + c->cache_by_alloc[c->caches_loaded++] = ca; + + if (c->caches_loaded == c->sb.nr_in_set) + run_cache_set(c); + + return NULL; +err: + bch_cache_set_unregister(c); + return err; +} + +/* Cache device */ + +void bch_cache_release(struct kobject *kobj) +{ + struct cache *ca = container_of(kobj, struct cache, kobj); + + if (ca->set) + ca->set->cache[ca->sb.nr_this_dev] = NULL; + + bch_cache_allocator_exit(ca); + + bio_split_pool_free(&ca->bio_split_hook); + + if (ca->alloc_workqueue) + destroy_workqueue(ca->alloc_workqueue); + + free_pages((unsigned long) ca->disk_buckets, ilog2(bucket_pages(ca))); + kfree(ca->prio_buckets); + vfree(ca->buckets); + + free_heap(&ca->heap); + free_fifo(&ca->unused); + free_fifo(&ca->free_inc); + free_fifo(&ca->free); + + if (ca->sb_bio.bi_inline_vecs[0].bv_page) + put_page(ca->sb_bio.bi_io_vec[0].bv_page); + + if (!IS_ERR_OR_NULL(ca->bdev)) { + blk_sync_queue(bdev_get_queue(ca->bdev)); + blkdev_put(ca->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL); + } + + kfree(ca); + module_put(THIS_MODULE); +} + +static int cache_alloc(struct cache_sb *sb, struct cache *ca) +{ + size_t free; + struct bucket *b; + + if (!ca) + return -ENOMEM; + + __module_get(THIS_MODULE); + kobject_init(&ca->kobj, &bch_cache_ktype); + + memcpy(&ca->sb, sb, sizeof(struct cache_sb)); + + INIT_LIST_HEAD(&ca->discards); + + bio_init(&ca->sb_bio); + ca->sb_bio.bi_max_vecs = 1; + ca->sb_bio.bi_io_vec = ca->sb_bio.bi_inline_vecs; + + bio_init(&ca->journal.bio); + ca->journal.bio.bi_max_vecs = 8; + ca->journal.bio.bi_io_vec = ca->journal.bio.bi_inline_vecs; + + free = roundup_pow_of_two(ca->sb.nbuckets) >> 9; + free = max_t(size_t, free, (prio_buckets(ca) + 8) * 2); + + if (!init_fifo(&ca->free, free, GFP_KERNEL) || + !init_fifo(&ca->free_inc, free << 2, GFP_KERNEL) || + !init_fifo(&ca->unused, free << 2, GFP_KERNEL) || + !init_heap(&ca->heap, free << 3, GFP_KERNEL) || + !(ca->buckets = vmalloc(sizeof(struct bucket) * + ca->sb.nbuckets)) || + !(ca->prio_buckets = kzalloc(sizeof(uint64_t) * prio_buckets(ca) * + 2, GFP_KERNEL)) || + !(ca->disk_buckets = alloc_bucket_pages(GFP_KERNEL, ca)) || + !(ca->alloc_workqueue = alloc_workqueue("bch_allocator", 0, 1)) || + bio_split_pool_init(&ca->bio_split_hook)) + goto err; + + ca->prio_last_buckets = ca->prio_buckets + prio_buckets(ca); + + memset(ca->buckets, 0, ca->sb.nbuckets * sizeof(struct bucket)); + for_each_bucket(b, ca) + atomic_set(&b->pin, 0); + + if (bch_cache_allocator_init(ca)) + goto err; + + return 0; +err: + kobject_put(&ca->kobj); + return -ENOMEM; +} + +static const char *register_cache(struct cache_sb *sb, struct page *sb_page, + struct block_device *bdev, struct cache *ca) +{ + char name[BDEVNAME_SIZE]; + const char *err = "cannot allocate memory"; + + if (cache_alloc(sb, ca) != 0) + return err; + + ca->sb_bio.bi_io_vec[0].bv_page = sb_page; + ca->bdev = bdev; + ca->bdev->bd_holder = ca; + + if (blk_queue_discard(bdev_get_queue(ca->bdev))) + ca->discard = CACHE_DISCARD(&ca->sb); + + err = "error creating kobject"; + if (kobject_add(&ca->kobj, &part_to_dev(bdev->bd_part)->kobj, "bcache")) + goto err; + + err = register_cache_set(ca); + if (err) + goto err; + + pr_info("registered cache device %s", bdevname(bdev, name)); + + return NULL; +err: + kobject_put(&ca->kobj); + pr_info("error opening %s: %s", bdevname(bdev, name), err); + /* Return NULL instead of an error because kobject_put() cleans + * everything up + */ + return NULL; +} + +/* Global interfaces/init */ + +static ssize_t register_bcache(struct kobject *, struct kobj_attribute *, + const char *, size_t); + +kobj_attribute_write(register, register_bcache); +kobj_attribute_write(register_quiet, register_bcache); + +static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, + const char *buffer, size_t size) +{ + ssize_t ret = size; + const char *err = "cannot allocate memory"; + char *path = NULL; + struct cache_sb *sb = NULL; + struct block_device *bdev = NULL; + struct page *sb_page = NULL; + + if (!try_module_get(THIS_MODULE)) + return -EBUSY; + + mutex_lock(&bch_register_lock); + + if (!(path = kstrndup(buffer, size, GFP_KERNEL)) || + !(sb = kmalloc(sizeof(struct cache_sb), GFP_KERNEL))) + goto err; + + err = "failed to open device"; + bdev = blkdev_get_by_path(strim(path), + FMODE_READ|FMODE_WRITE|FMODE_EXCL, + sb); + if (bdev == ERR_PTR(-EBUSY)) + err = "device busy"; + + if (IS_ERR(bdev) || + set_blocksize(bdev, 4096)) + goto err; + + err = read_super(sb, bdev, &sb_page); + if (err) + goto err_close; + + if (SB_IS_BDEV(sb)) { + struct cached_dev *dc = kzalloc(sizeof(*dc), GFP_KERNEL); + + err = register_bdev(sb, sb_page, bdev, dc); + } else { + struct cache *ca = kzalloc(sizeof(*ca), GFP_KERNEL); + + err = register_cache(sb, sb_page, bdev, ca); + } + + if (err) { + /* register_(bdev|cache) will only return an error if they + * didn't get far enough to create the kobject - if they did, + * the kobject destructor will do this cleanup. + */ + put_page(sb_page); +err_close: + blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL); +err: + if (attr != &ksysfs_register_quiet) + pr_info("error opening %s: %s", path, err); + ret = -EINVAL; + } + + kfree(sb); + kfree(path); + mutex_unlock(&bch_register_lock); + module_put(THIS_MODULE); + return ret; +} + +static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) +{ + if (code == SYS_DOWN || + code == SYS_HALT || + code == SYS_POWER_OFF) { + DEFINE_WAIT(wait); + unsigned long start = jiffies; + bool stopped = false; + + struct cache_set *c, *tc; + struct cached_dev *dc, *tdc; + + mutex_lock(&bch_register_lock); + + if (list_empty(&bch_cache_sets) && + list_empty(&uncached_devices)) + goto out; + + pr_info("Stopping all devices:"); + + list_for_each_entry_safe(c, tc, &bch_cache_sets, list) + bch_cache_set_stop(c); + + list_for_each_entry_safe(dc, tdc, &uncached_devices, list) + bcache_device_stop(&dc->disk); + + /* What's a condition variable? */ + while (1) { + long timeout = start + 2 * HZ - jiffies; + + stopped = list_empty(&bch_cache_sets) && + list_empty(&uncached_devices); + + if (timeout < 0 || stopped) + break; + + prepare_to_wait(&unregister_wait, &wait, + TASK_UNINTERRUPTIBLE); + + mutex_unlock(&bch_register_lock); + schedule_timeout(timeout); + mutex_lock(&bch_register_lock); + } + + finish_wait(&unregister_wait, &wait); + + if (stopped) + pr_info("All devices stopped"); + else + pr_notice("Timeout waiting for devices to be closed"); +out: + mutex_unlock(&bch_register_lock); + } + + return NOTIFY_DONE; +} + +static struct notifier_block reboot = { + .notifier_call = bcache_reboot, + .priority = INT_MAX, /* before any real devices */ +}; + +static void bcache_exit(void) +{ + bch_debug_exit(); + bch_writeback_exit(); + bch_request_exit(); + bch_btree_exit(); + if (bcache_kobj) + kobject_put(bcache_kobj); + if (bcache_wq) + destroy_workqueue(bcache_wq); + unregister_blkdev(bcache_major, "bcache"); + unregister_reboot_notifier(&reboot); +} + +static int __init bcache_init(void) +{ + static const struct attribute *files[] = { + &ksysfs_register.attr, + &ksysfs_register_quiet.attr, + NULL + }; + + mutex_init(&bch_register_lock); + init_waitqueue_head(&unregister_wait); + register_reboot_notifier(&reboot); + closure_debug_init(); + + bcache_major = register_blkdev(0, "bcache"); + if (bcache_major < 0) + return bcache_major; + + if (!(bcache_wq = create_workqueue("bcache")) || + !(bcache_kobj = kobject_create_and_add("bcache", fs_kobj)) || + sysfs_create_files(bcache_kobj, files) || + bch_btree_init() || + bch_request_init() || + bch_writeback_init() || + bch_debug_init(bcache_kobj)) + goto err; + + return 0; +err: + bcache_exit(); + return -ENOMEM; +} + +module_exit(bcache_exit); +module_init(bcache_init); diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c new file mode 100644 index 000000000000..4d9cca47e4c6 --- /dev/null +++ b/drivers/md/bcache/sysfs.c @@ -0,0 +1,817 @@ +/* + * bcache sysfs interfaces + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "sysfs.h" +#include "btree.h" +#include "request.h" + +#include + +static const char * const cache_replacement_policies[] = { + "lru", + "fifo", + "random", + NULL +}; + +write_attribute(attach); +write_attribute(detach); +write_attribute(unregister); +write_attribute(stop); +write_attribute(clear_stats); +write_attribute(trigger_gc); +write_attribute(prune_cache); +write_attribute(flash_vol_create); + +read_attribute(bucket_size); +read_attribute(block_size); +read_attribute(nbuckets); +read_attribute(tree_depth); +read_attribute(root_usage_percent); +read_attribute(priority_stats); +read_attribute(btree_cache_size); +read_attribute(btree_cache_max_chain); +read_attribute(cache_available_percent); +read_attribute(written); +read_attribute(btree_written); +read_attribute(metadata_written); +read_attribute(active_journal_entries); + +sysfs_time_stats_attribute(btree_gc, sec, ms); +sysfs_time_stats_attribute(btree_split, sec, us); +sysfs_time_stats_attribute(btree_sort, ms, us); +sysfs_time_stats_attribute(btree_read, ms, us); +sysfs_time_stats_attribute(try_harder, ms, us); + +read_attribute(btree_nodes); +read_attribute(btree_used_percent); +read_attribute(average_key_size); +read_attribute(dirty_data); +read_attribute(bset_tree_stats); + +read_attribute(state); +read_attribute(cache_read_races); +read_attribute(writeback_keys_done); +read_attribute(writeback_keys_failed); +read_attribute(io_errors); +read_attribute(congested); +rw_attribute(congested_read_threshold_us); +rw_attribute(congested_write_threshold_us); + +rw_attribute(sequential_cutoff); +rw_attribute(sequential_merge); +rw_attribute(data_csum); +rw_attribute(cache_mode); +rw_attribute(writeback_metadata); +rw_attribute(writeback_running); +rw_attribute(writeback_percent); +rw_attribute(writeback_delay); +rw_attribute(writeback_rate); + +rw_attribute(writeback_rate_update_seconds); +rw_attribute(writeback_rate_d_term); +rw_attribute(writeback_rate_p_term_inverse); +rw_attribute(writeback_rate_d_smooth); +read_attribute(writeback_rate_debug); + +rw_attribute(synchronous); +rw_attribute(journal_delay_ms); +rw_attribute(discard); +rw_attribute(running); +rw_attribute(label); +rw_attribute(readahead); +rw_attribute(io_error_limit); +rw_attribute(io_error_halflife); +rw_attribute(verify); +rw_attribute(key_merging_disabled); +rw_attribute(gc_always_rewrite); +rw_attribute(freelist_percent); +rw_attribute(cache_replacement_policy); +rw_attribute(btree_shrinker_disabled); +rw_attribute(copy_gc_enabled); +rw_attribute(size); + +SHOW(__bch_cached_dev) +{ + struct cached_dev *dc = container_of(kobj, struct cached_dev, + disk.kobj); + const char *states[] = { "no cache", "clean", "dirty", "inconsistent" }; + +#define var(stat) (dc->stat) + + if (attr == &sysfs_cache_mode) + return bch_snprint_string_list(buf, PAGE_SIZE, + bch_cache_modes + 1, + BDEV_CACHE_MODE(&dc->sb)); + + sysfs_printf(data_csum, "%i", dc->disk.data_csum); + var_printf(verify, "%i"); + var_printf(writeback_metadata, "%i"); + var_printf(writeback_running, "%i"); + var_print(writeback_delay); + var_print(writeback_percent); + sysfs_print(writeback_rate, dc->writeback_rate.rate); + + var_print(writeback_rate_update_seconds); + var_print(writeback_rate_d_term); + var_print(writeback_rate_p_term_inverse); + var_print(writeback_rate_d_smooth); + + if (attr == &sysfs_writeback_rate_debug) { + char dirty[20]; + char derivative[20]; + char target[20]; + bch_hprint(dirty, + atomic_long_read(&dc->disk.sectors_dirty) << 9); + bch_hprint(derivative, dc->writeback_rate_derivative << 9); + bch_hprint(target, dc->writeback_rate_target << 9); + + return sprintf(buf, + "rate:\t\t%u\n" + "change:\t\t%i\n" + "dirty:\t\t%s\n" + "derivative:\t%s\n" + "target:\t\t%s\n", + dc->writeback_rate.rate, + dc->writeback_rate_change, + dirty, derivative, target); + } + + sysfs_hprint(dirty_data, + atomic_long_read(&dc->disk.sectors_dirty) << 9); + + var_printf(sequential_merge, "%i"); + var_hprint(sequential_cutoff); + var_hprint(readahead); + + sysfs_print(running, atomic_read(&dc->running)); + sysfs_print(state, states[BDEV_STATE(&dc->sb)]); + + if (attr == &sysfs_label) { + memcpy(buf, dc->sb.label, SB_LABEL_SIZE); + buf[SB_LABEL_SIZE + 1] = '\0'; + strcat(buf, "\n"); + return strlen(buf); + } + +#undef var + return 0; +} +SHOW_LOCKED(bch_cached_dev) + +STORE(__cached_dev) +{ + struct cached_dev *dc = container_of(kobj, struct cached_dev, + disk.kobj); + unsigned v = size; + struct cache_set *c; + +#define d_strtoul(var) sysfs_strtoul(var, dc->var) +#define d_strtoi_h(var) sysfs_hatoi(var, dc->var) + + sysfs_strtoul(data_csum, dc->disk.data_csum); + d_strtoul(verify); + d_strtoul(writeback_metadata); + d_strtoul(writeback_running); + d_strtoul(writeback_delay); + sysfs_strtoul_clamp(writeback_rate, + dc->writeback_rate.rate, 1, 1000000); + sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40); + + d_strtoul(writeback_rate_update_seconds); + d_strtoul(writeback_rate_d_term); + d_strtoul(writeback_rate_p_term_inverse); + sysfs_strtoul_clamp(writeback_rate_p_term_inverse, + dc->writeback_rate_p_term_inverse, 1, INT_MAX); + d_strtoul(writeback_rate_d_smooth); + + d_strtoul(sequential_merge); + d_strtoi_h(sequential_cutoff); + d_strtoi_h(readahead); + + if (attr == &sysfs_clear_stats) + bch_cache_accounting_clear(&dc->accounting); + + if (attr == &sysfs_running && + strtoul_or_return(buf)) + bch_cached_dev_run(dc); + + if (attr == &sysfs_cache_mode) { + ssize_t v = bch_read_string_list(buf, bch_cache_modes + 1); + + if (v < 0) + return v; + + if ((unsigned) v != BDEV_CACHE_MODE(&dc->sb)) { + SET_BDEV_CACHE_MODE(&dc->sb, v); + bch_write_bdev_super(dc, NULL); + } + } + + if (attr == &sysfs_label) { + memcpy(dc->sb.label, buf, SB_LABEL_SIZE); + bch_write_bdev_super(dc, NULL); + if (dc->disk.c) { + memcpy(dc->disk.c->uuids[dc->disk.id].label, + buf, SB_LABEL_SIZE); + bch_uuid_write(dc->disk.c); + } + } + + if (attr == &sysfs_attach) { + if (bch_parse_uuid(buf, dc->sb.set_uuid) < 16) + return -EINVAL; + + list_for_each_entry(c, &bch_cache_sets, list) { + v = bch_cached_dev_attach(dc, c); + if (!v) + return size; + } + + pr_err("Can't attach %s: cache set not found", buf); + size = v; + } + + if (attr == &sysfs_detach && dc->disk.c) + bch_cached_dev_detach(dc); + + if (attr == &sysfs_stop) + bcache_device_stop(&dc->disk); + + return size; +} + +STORE(bch_cached_dev) +{ + struct cached_dev *dc = container_of(kobj, struct cached_dev, + disk.kobj); + + mutex_lock(&bch_register_lock); + size = __cached_dev_store(kobj, attr, buf, size); + + if (attr == &sysfs_writeback_running) + bch_writeback_queue(dc); + + if (attr == &sysfs_writeback_percent) + schedule_delayed_work(&dc->writeback_rate_update, + dc->writeback_rate_update_seconds * HZ); + + mutex_unlock(&bch_register_lock); + return size; +} + +static struct attribute *bch_cached_dev_files[] = { + &sysfs_attach, + &sysfs_detach, + &sysfs_stop, +#if 0 + &sysfs_data_csum, +#endif + &sysfs_cache_mode, + &sysfs_writeback_metadata, + &sysfs_writeback_running, + &sysfs_writeback_delay, + &sysfs_writeback_percent, + &sysfs_writeback_rate, + &sysfs_writeback_rate_update_seconds, + &sysfs_writeback_rate_d_term, + &sysfs_writeback_rate_p_term_inverse, + &sysfs_writeback_rate_d_smooth, + &sysfs_writeback_rate_debug, + &sysfs_dirty_data, + &sysfs_sequential_cutoff, + &sysfs_sequential_merge, + &sysfs_clear_stats, + &sysfs_running, + &sysfs_state, + &sysfs_label, + &sysfs_readahead, +#ifdef CONFIG_BCACHE_DEBUG + &sysfs_verify, +#endif + NULL +}; +KTYPE(bch_cached_dev); + +SHOW(bch_flash_dev) +{ + struct bcache_device *d = container_of(kobj, struct bcache_device, + kobj); + struct uuid_entry *u = &d->c->uuids[d->id]; + + sysfs_printf(data_csum, "%i", d->data_csum); + sysfs_hprint(size, u->sectors << 9); + + if (attr == &sysfs_label) { + memcpy(buf, u->label, SB_LABEL_SIZE); + buf[SB_LABEL_SIZE + 1] = '\0'; + strcat(buf, "\n"); + return strlen(buf); + } + + return 0; +} + +STORE(__bch_flash_dev) +{ + struct bcache_device *d = container_of(kobj, struct bcache_device, + kobj); + struct uuid_entry *u = &d->c->uuids[d->id]; + + sysfs_strtoul(data_csum, d->data_csum); + + if (attr == &sysfs_size) { + uint64_t v; + strtoi_h_or_return(buf, v); + + u->sectors = v >> 9; + bch_uuid_write(d->c); + set_capacity(d->disk, u->sectors); + } + + if (attr == &sysfs_label) { + memcpy(u->label, buf, SB_LABEL_SIZE); + bch_uuid_write(d->c); + } + + if (attr == &sysfs_unregister) { + atomic_set(&d->detaching, 1); + bcache_device_stop(d); + } + + return size; +} +STORE_LOCKED(bch_flash_dev) + +static struct attribute *bch_flash_dev_files[] = { + &sysfs_unregister, +#if 0 + &sysfs_data_csum, +#endif + &sysfs_label, + &sysfs_size, + NULL +}; +KTYPE(bch_flash_dev); + +SHOW(__bch_cache_set) +{ + unsigned root_usage(struct cache_set *c) + { + unsigned bytes = 0; + struct bkey *k; + struct btree *b; + struct btree_iter iter; + + goto lock_root; + + do { + rw_unlock(false, b); +lock_root: + b = c->root; + rw_lock(false, b, b->level); + } while (b != c->root); + + for_each_key_filter(b, k, &iter, bch_ptr_bad) + bytes += bkey_bytes(k); + + rw_unlock(false, b); + + return (bytes * 100) / btree_bytes(c); + } + + size_t cache_size(struct cache_set *c) + { + size_t ret = 0; + struct btree *b; + + mutex_lock(&c->bucket_lock); + list_for_each_entry(b, &c->btree_cache, list) + ret += 1 << (b->page_order + PAGE_SHIFT); + + mutex_unlock(&c->bucket_lock); + return ret; + } + + unsigned cache_max_chain(struct cache_set *c) + { + unsigned ret = 0; + struct hlist_head *h; + + mutex_lock(&c->bucket_lock); + + for (h = c->bucket_hash; + h < c->bucket_hash + (1 << BUCKET_HASH_BITS); + h++) { + unsigned i = 0; + struct hlist_node *p; + + hlist_for_each(p, h) + i++; + + ret = max(ret, i); + } + + mutex_unlock(&c->bucket_lock); + return ret; + } + + unsigned btree_used(struct cache_set *c) + { + return div64_u64(c->gc_stats.key_bytes * 100, + (c->gc_stats.nodes ?: 1) * btree_bytes(c)); + } + + unsigned average_key_size(struct cache_set *c) + { + return c->gc_stats.nkeys + ? div64_u64(c->gc_stats.data, c->gc_stats.nkeys) + : 0; + } + + struct cache_set *c = container_of(kobj, struct cache_set, kobj); + + sysfs_print(synchronous, CACHE_SYNC(&c->sb)); + sysfs_print(journal_delay_ms, c->journal_delay_ms); + sysfs_hprint(bucket_size, bucket_bytes(c)); + sysfs_hprint(block_size, block_bytes(c)); + sysfs_print(tree_depth, c->root->level); + sysfs_print(root_usage_percent, root_usage(c)); + + sysfs_hprint(btree_cache_size, cache_size(c)); + sysfs_print(btree_cache_max_chain, cache_max_chain(c)); + sysfs_print(cache_available_percent, 100 - c->gc_stats.in_use); + + sysfs_print_time_stats(&c->btree_gc_time, btree_gc, sec, ms); + sysfs_print_time_stats(&c->btree_split_time, btree_split, sec, us); + sysfs_print_time_stats(&c->sort_time, btree_sort, ms, us); + sysfs_print_time_stats(&c->btree_read_time, btree_read, ms, us); + sysfs_print_time_stats(&c->try_harder_time, try_harder, ms, us); + + sysfs_print(btree_used_percent, btree_used(c)); + sysfs_print(btree_nodes, c->gc_stats.nodes); + sysfs_hprint(dirty_data, c->gc_stats.dirty); + sysfs_hprint(average_key_size, average_key_size(c)); + + sysfs_print(cache_read_races, + atomic_long_read(&c->cache_read_races)); + + sysfs_print(writeback_keys_done, + atomic_long_read(&c->writeback_keys_done)); + sysfs_print(writeback_keys_failed, + atomic_long_read(&c->writeback_keys_failed)); + + /* See count_io_errors for why 88 */ + sysfs_print(io_error_halflife, c->error_decay * 88); + sysfs_print(io_error_limit, c->error_limit >> IO_ERROR_SHIFT); + + sysfs_hprint(congested, + ((uint64_t) bch_get_congested(c)) << 9); + sysfs_print(congested_read_threshold_us, + c->congested_read_threshold_us); + sysfs_print(congested_write_threshold_us, + c->congested_write_threshold_us); + + sysfs_print(active_journal_entries, fifo_used(&c->journal.pin)); + sysfs_printf(verify, "%i", c->verify); + sysfs_printf(key_merging_disabled, "%i", c->key_merging_disabled); + sysfs_printf(gc_always_rewrite, "%i", c->gc_always_rewrite); + sysfs_printf(btree_shrinker_disabled, "%i", c->shrinker_disabled); + sysfs_printf(copy_gc_enabled, "%i", c->copy_gc_enabled); + + if (attr == &sysfs_bset_tree_stats) + return bch_bset_print_stats(c, buf); + + return 0; +} +SHOW_LOCKED(bch_cache_set) + +STORE(__bch_cache_set) +{ + struct cache_set *c = container_of(kobj, struct cache_set, kobj); + + if (attr == &sysfs_unregister) + bch_cache_set_unregister(c); + + if (attr == &sysfs_stop) + bch_cache_set_stop(c); + + if (attr == &sysfs_synchronous) { + bool sync = strtoul_or_return(buf); + + if (sync != CACHE_SYNC(&c->sb)) { + SET_CACHE_SYNC(&c->sb, sync); + bcache_write_super(c); + } + } + + if (attr == &sysfs_flash_vol_create) { + int r; + uint64_t v; + strtoi_h_or_return(buf, v); + + r = bch_flash_dev_create(c, v); + if (r) + return r; + } + + if (attr == &sysfs_clear_stats) { + atomic_long_set(&c->writeback_keys_done, 0); + atomic_long_set(&c->writeback_keys_failed, 0); + + memset(&c->gc_stats, 0, sizeof(struct gc_stat)); + bch_cache_accounting_clear(&c->accounting); + } + + if (attr == &sysfs_trigger_gc) + bch_queue_gc(c); + + if (attr == &sysfs_prune_cache) { + struct shrink_control sc; + sc.gfp_mask = GFP_KERNEL; + sc.nr_to_scan = strtoul_or_return(buf); + c->shrink.shrink(&c->shrink, &sc); + } + + sysfs_strtoul(congested_read_threshold_us, + c->congested_read_threshold_us); + sysfs_strtoul(congested_write_threshold_us, + c->congested_write_threshold_us); + + if (attr == &sysfs_io_error_limit) + c->error_limit = strtoul_or_return(buf) << IO_ERROR_SHIFT; + + /* See count_io_errors() for why 88 */ + if (attr == &sysfs_io_error_halflife) + c->error_decay = strtoul_or_return(buf) / 88; + + sysfs_strtoul(journal_delay_ms, c->journal_delay_ms); + sysfs_strtoul(verify, c->verify); + sysfs_strtoul(key_merging_disabled, c->key_merging_disabled); + sysfs_strtoul(gc_always_rewrite, c->gc_always_rewrite); + sysfs_strtoul(btree_shrinker_disabled, c->shrinker_disabled); + sysfs_strtoul(copy_gc_enabled, c->copy_gc_enabled); + + return size; +} +STORE_LOCKED(bch_cache_set) + +SHOW(bch_cache_set_internal) +{ + struct cache_set *c = container_of(kobj, struct cache_set, internal); + return bch_cache_set_show(&c->kobj, attr, buf); +} + +STORE(bch_cache_set_internal) +{ + struct cache_set *c = container_of(kobj, struct cache_set, internal); + return bch_cache_set_store(&c->kobj, attr, buf, size); +} + +static void bch_cache_set_internal_release(struct kobject *k) +{ +} + +static struct attribute *bch_cache_set_files[] = { + &sysfs_unregister, + &sysfs_stop, + &sysfs_synchronous, + &sysfs_journal_delay_ms, + &sysfs_flash_vol_create, + + &sysfs_bucket_size, + &sysfs_block_size, + &sysfs_tree_depth, + &sysfs_root_usage_percent, + &sysfs_btree_cache_size, + &sysfs_cache_available_percent, + + &sysfs_average_key_size, + &sysfs_dirty_data, + + &sysfs_io_error_limit, + &sysfs_io_error_halflife, + &sysfs_congested, + &sysfs_congested_read_threshold_us, + &sysfs_congested_write_threshold_us, + &sysfs_clear_stats, + NULL +}; +KTYPE(bch_cache_set); + +static struct attribute *bch_cache_set_internal_files[] = { + &sysfs_active_journal_entries, + + sysfs_time_stats_attribute_list(btree_gc, sec, ms) + sysfs_time_stats_attribute_list(btree_split, sec, us) + sysfs_time_stats_attribute_list(btree_sort, ms, us) + sysfs_time_stats_attribute_list(btree_read, ms, us) + sysfs_time_stats_attribute_list(try_harder, ms, us) + + &sysfs_btree_nodes, + &sysfs_btree_used_percent, + &sysfs_btree_cache_max_chain, + + &sysfs_bset_tree_stats, + &sysfs_cache_read_races, + &sysfs_writeback_keys_done, + &sysfs_writeback_keys_failed, + + &sysfs_trigger_gc, + &sysfs_prune_cache, +#ifdef CONFIG_BCACHE_DEBUG + &sysfs_verify, + &sysfs_key_merging_disabled, +#endif + &sysfs_gc_always_rewrite, + &sysfs_btree_shrinker_disabled, + &sysfs_copy_gc_enabled, + NULL +}; +KTYPE(bch_cache_set_internal); + +SHOW(__bch_cache) +{ + struct cache *ca = container_of(kobj, struct cache, kobj); + + sysfs_hprint(bucket_size, bucket_bytes(ca)); + sysfs_hprint(block_size, block_bytes(ca)); + sysfs_print(nbuckets, ca->sb.nbuckets); + sysfs_print(discard, ca->discard); + sysfs_hprint(written, atomic_long_read(&ca->sectors_written) << 9); + sysfs_hprint(btree_written, + atomic_long_read(&ca->btree_sectors_written) << 9); + sysfs_hprint(metadata_written, + (atomic_long_read(&ca->meta_sectors_written) + + atomic_long_read(&ca->btree_sectors_written)) << 9); + + sysfs_print(io_errors, + atomic_read(&ca->io_errors) >> IO_ERROR_SHIFT); + + sysfs_print(freelist_percent, ca->free.size * 100 / + ((size_t) ca->sb.nbuckets)); + + if (attr == &sysfs_cache_replacement_policy) + return bch_snprint_string_list(buf, PAGE_SIZE, + cache_replacement_policies, + CACHE_REPLACEMENT(&ca->sb)); + + if (attr == &sysfs_priority_stats) { + int cmp(const void *l, const void *r) + { return *((uint16_t *) r) - *((uint16_t *) l); } + + /* Number of quantiles we compute */ + const unsigned nq = 31; + + size_t n = ca->sb.nbuckets, i, unused, btree; + uint64_t sum = 0; + uint16_t q[nq], *p, *cached; + ssize_t ret; + + cached = p = vmalloc(ca->sb.nbuckets * sizeof(uint16_t)); + if (!p) + return -ENOMEM; + + mutex_lock(&ca->set->bucket_lock); + for (i = ca->sb.first_bucket; i < n; i++) + p[i] = ca->buckets[i].prio; + mutex_unlock(&ca->set->bucket_lock); + + sort(p, n, sizeof(uint16_t), cmp, NULL); + + while (n && + !cached[n - 1]) + --n; + + unused = ca->sb.nbuckets - n; + + while (cached < p + n && + *cached == BTREE_PRIO) + cached++; + + btree = cached - p; + n -= btree; + + for (i = 0; i < n; i++) + sum += INITIAL_PRIO - cached[i]; + + if (n) + do_div(sum, n); + + for (i = 0; i < nq; i++) + q[i] = INITIAL_PRIO - cached[n * (i + 1) / (nq + 1)]; + + vfree(p); + + ret = snprintf(buf, PAGE_SIZE, + "Unused: %zu%%\n" + "Metadata: %zu%%\n" + "Average: %llu\n" + "Sectors per Q: %zu\n" + "Quantiles: [", + unused * 100 / (size_t) ca->sb.nbuckets, + btree * 100 / (size_t) ca->sb.nbuckets, sum, + n * ca->sb.bucket_size / (nq + 1)); + + for (i = 0; i < nq && ret < (ssize_t) PAGE_SIZE; i++) + ret += snprintf(buf + ret, PAGE_SIZE - ret, + i < nq - 1 ? "%u " : "%u]\n", q[i]); + + buf[PAGE_SIZE - 1] = '\0'; + return ret; + } + + return 0; +} +SHOW_LOCKED(bch_cache) + +STORE(__bch_cache) +{ + struct cache *ca = container_of(kobj, struct cache, kobj); + + if (attr == &sysfs_discard) { + bool v = strtoul_or_return(buf); + + if (blk_queue_discard(bdev_get_queue(ca->bdev))) + ca->discard = v; + + if (v != CACHE_DISCARD(&ca->sb)) { + SET_CACHE_DISCARD(&ca->sb, v); + bcache_write_super(ca->set); + } + } + + if (attr == &sysfs_cache_replacement_policy) { + ssize_t v = bch_read_string_list(buf, cache_replacement_policies); + + if (v < 0) + return v; + + if ((unsigned) v != CACHE_REPLACEMENT(&ca->sb)) { + mutex_lock(&ca->set->bucket_lock); + SET_CACHE_REPLACEMENT(&ca->sb, v); + mutex_unlock(&ca->set->bucket_lock); + + bcache_write_super(ca->set); + } + } + + if (attr == &sysfs_freelist_percent) { + DECLARE_FIFO(long, free); + long i; + size_t p = strtoul_or_return(buf); + + p = clamp_t(size_t, + ((size_t) ca->sb.nbuckets * p) / 100, + roundup_pow_of_two(ca->sb.nbuckets) >> 9, + ca->sb.nbuckets / 2); + + if (!init_fifo_exact(&free, p, GFP_KERNEL)) + return -ENOMEM; + + mutex_lock(&ca->set->bucket_lock); + + fifo_move(&free, &ca->free); + fifo_swap(&free, &ca->free); + + mutex_unlock(&ca->set->bucket_lock); + + while (fifo_pop(&free, i)) + atomic_dec(&ca->buckets[i].pin); + + free_fifo(&free); + } + + if (attr == &sysfs_clear_stats) { + atomic_long_set(&ca->sectors_written, 0); + atomic_long_set(&ca->btree_sectors_written, 0); + atomic_long_set(&ca->meta_sectors_written, 0); + atomic_set(&ca->io_count, 0); + atomic_set(&ca->io_errors, 0); + } + + return size; +} +STORE_LOCKED(bch_cache) + +static struct attribute *bch_cache_files[] = { + &sysfs_bucket_size, + &sysfs_block_size, + &sysfs_nbuckets, + &sysfs_priority_stats, + &sysfs_discard, + &sysfs_written, + &sysfs_btree_written, + &sysfs_metadata_written, + &sysfs_io_errors, + &sysfs_clear_stats, + &sysfs_freelist_percent, + &sysfs_cache_replacement_policy, + NULL +}; +KTYPE(bch_cache); diff --git a/drivers/md/bcache/sysfs.h b/drivers/md/bcache/sysfs.h new file mode 100644 index 000000000000..0526fe92a683 --- /dev/null +++ b/drivers/md/bcache/sysfs.h @@ -0,0 +1,110 @@ +#ifndef _BCACHE_SYSFS_H_ +#define _BCACHE_SYSFS_H_ + +#define KTYPE(type) \ +struct kobj_type type ## _ktype = { \ + .release = type ## _release, \ + .sysfs_ops = &((const struct sysfs_ops) { \ + .show = type ## _show, \ + .store = type ## _store \ + }), \ + .default_attrs = type ## _files \ +} + +#define SHOW(fn) \ +static ssize_t fn ## _show(struct kobject *kobj, struct attribute *attr,\ + char *buf) \ + +#define STORE(fn) \ +static ssize_t fn ## _store(struct kobject *kobj, struct attribute *attr,\ + const char *buf, size_t size) \ + +#define SHOW_LOCKED(fn) \ +SHOW(fn) \ +{ \ + ssize_t ret; \ + mutex_lock(&bch_register_lock); \ + ret = __ ## fn ## _show(kobj, attr, buf); \ + mutex_unlock(&bch_register_lock); \ + return ret; \ +} + +#define STORE_LOCKED(fn) \ +STORE(fn) \ +{ \ + ssize_t ret; \ + mutex_lock(&bch_register_lock); \ + ret = __ ## fn ## _store(kobj, attr, buf, size); \ + mutex_unlock(&bch_register_lock); \ + return ret; \ +} + +#define __sysfs_attribute(_name, _mode) \ + static struct attribute sysfs_##_name = \ + { .name = #_name, .mode = _mode } + +#define write_attribute(n) __sysfs_attribute(n, S_IWUSR) +#define read_attribute(n) __sysfs_attribute(n, S_IRUGO) +#define rw_attribute(n) __sysfs_attribute(n, S_IRUGO|S_IWUSR) + +#define sysfs_printf(file, fmt, ...) \ +do { \ + if (attr == &sysfs_ ## file) \ + return snprintf(buf, PAGE_SIZE, fmt "\n", __VA_ARGS__); \ +} while (0) + +#define sysfs_print(file, var) \ +do { \ + if (attr == &sysfs_ ## file) \ + return snprint(buf, PAGE_SIZE, var); \ +} while (0) + +#define sysfs_hprint(file, val) \ +do { \ + if (attr == &sysfs_ ## file) { \ + ssize_t ret = bch_hprint(buf, val); \ + strcat(buf, "\n"); \ + return ret + 1; \ + } \ +} while (0) + +#define var_printf(_var, fmt) sysfs_printf(_var, fmt, var(_var)) +#define var_print(_var) sysfs_print(_var, var(_var)) +#define var_hprint(_var) sysfs_hprint(_var, var(_var)) + +#define sysfs_strtoul(file, var) \ +do { \ + if (attr == &sysfs_ ## file) \ + return strtoul_safe(buf, var) ?: (ssize_t) size; \ +} while (0) + +#define sysfs_strtoul_clamp(file, var, min, max) \ +do { \ + if (attr == &sysfs_ ## file) \ + return strtoul_safe_clamp(buf, var, min, max) \ + ?: (ssize_t) size; \ +} while (0) + +#define strtoul_or_return(cp) \ +({ \ + unsigned long _v; \ + int _r = kstrtoul(cp, 10, &_v); \ + if (_r) \ + return _r; \ + _v; \ +}) + +#define strtoi_h_or_return(cp, v) \ +do { \ + int _r = strtoi_h(cp, &v); \ + if (_r) \ + return _r; \ +} while (0) + +#define sysfs_hatoi(file, var) \ +do { \ + if (attr == &sysfs_ ## file) \ + return strtoi_h(buf, &var) ?: (ssize_t) size; \ +} while (0) + +#endif /* _BCACHE_SYSFS_H_ */ diff --git a/drivers/md/bcache/trace.c b/drivers/md/bcache/trace.c new file mode 100644 index 000000000000..983f9bb411bc --- /dev/null +++ b/drivers/md/bcache/trace.c @@ -0,0 +1,26 @@ +#include "bcache.h" +#include "btree.h" +#include "request.h" + +#include + +#define CREATE_TRACE_POINTS +#include + +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_request_start); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_request_end); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_passthrough); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_cache_hit); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_cache_miss); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_read_retry); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_writethrough); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_writeback); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_write_skip); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_btree_read); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_btree_write); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_write_dirty); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_read_dirty); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_journal_write); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_cache_insert); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_gc_start); +EXPORT_TRACEPOINT_SYMBOL_GPL(bcache_gc_end); diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c new file mode 100644 index 000000000000..da3a99e85b1e --- /dev/null +++ b/drivers/md/bcache/util.c @@ -0,0 +1,377 @@ +/* + * random utiility code, for bcache but in theory not specific to bcache + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "util.h" + +#define simple_strtoint(c, end, base) simple_strtol(c, end, base) +#define simple_strtouint(c, end, base) simple_strtoul(c, end, base) + +#define STRTO_H(name, type) \ +int bch_ ## name ## _h(const char *cp, type *res) \ +{ \ + int u = 0; \ + char *e; \ + type i = simple_ ## name(cp, &e, 10); \ + \ + switch (tolower(*e)) { \ + default: \ + return -EINVAL; \ + case 'y': \ + case 'z': \ + u++; \ + case 'e': \ + u++; \ + case 'p': \ + u++; \ + case 't': \ + u++; \ + case 'g': \ + u++; \ + case 'm': \ + u++; \ + case 'k': \ + u++; \ + if (e++ == cp) \ + return -EINVAL; \ + case '\n': \ + case '\0': \ + if (*e == '\n') \ + e++; \ + } \ + \ + if (*e) \ + return -EINVAL; \ + \ + while (u--) { \ + if ((type) ~0 > 0 && \ + (type) ~0 / 1024 <= i) \ + return -EINVAL; \ + if ((i > 0 && ANYSINT_MAX(type) / 1024 < i) || \ + (i < 0 && -ANYSINT_MAX(type) / 1024 > i)) \ + return -EINVAL; \ + i *= 1024; \ + } \ + \ + *res = i; \ + return 0; \ +} \ + +STRTO_H(strtoint, int) +STRTO_H(strtouint, unsigned int) +STRTO_H(strtoll, long long) +STRTO_H(strtoull, unsigned long long) + +ssize_t bch_hprint(char *buf, int64_t v) +{ + static const char units[] = "?kMGTPEZY"; + char dec[4] = ""; + int u, t = 0; + + for (u = 0; v >= 1024 || v <= -1024; u++) { + t = v & ~(~0 << 10); + v >>= 10; + } + + if (!u) + return sprintf(buf, "%llu", v); + + if (v < 100 && v > -100) + snprintf(dec, sizeof(dec), ".%i", t / 100); + + return sprintf(buf, "%lli%s%c", v, dec, units[u]); +} + +ssize_t bch_snprint_string_list(char *buf, size_t size, const char * const list[], + size_t selected) +{ + char *out = buf; + size_t i; + + for (i = 0; list[i]; i++) + out += snprintf(out, buf + size - out, + i == selected ? "[%s] " : "%s ", list[i]); + + out[-1] = '\n'; + return out - buf; +} + +ssize_t bch_read_string_list(const char *buf, const char * const list[]) +{ + size_t i; + char *s, *d = kstrndup(buf, PAGE_SIZE - 1, GFP_KERNEL); + if (!d) + return -ENOMEM; + + s = strim(d); + + for (i = 0; list[i]; i++) + if (!strcmp(list[i], s)) + break; + + kfree(d); + + if (!list[i]) + return -EINVAL; + + return i; +} + +bool bch_is_zero(const char *p, size_t n) +{ + size_t i; + + for (i = 0; i < n; i++) + if (p[i]) + return false; + return true; +} + +int bch_parse_uuid(const char *s, char *uuid) +{ + size_t i, j, x; + memset(uuid, 0, 16); + + for (i = 0, j = 0; + i < strspn(s, "-0123456789:ABCDEFabcdef") && j < 32; + i++) { + x = s[i] | 32; + + switch (x) { + case '0'...'9': + x -= '0'; + break; + case 'a'...'f': + x -= 'a' - 10; + break; + default: + continue; + } + + if (!(j & 1)) + x <<= 4; + uuid[j++ >> 1] |= x; + } + return i; +} + +void bch_time_stats_update(struct time_stats *stats, uint64_t start_time) +{ + uint64_t now = local_clock(); + uint64_t duration = time_after64(now, start_time) + ? now - start_time : 0; + uint64_t last = time_after64(now, stats->last) + ? now - stats->last : 0; + + stats->max_duration = max(stats->max_duration, duration); + + if (stats->last) { + ewma_add(stats->average_duration, duration, 8, 8); + + if (stats->average_frequency) + ewma_add(stats->average_frequency, last, 8, 8); + else + stats->average_frequency = last << 8; + } else { + stats->average_duration = duration << 8; + } + + stats->last = now ?: 1; +} + +unsigned bch_next_delay(struct ratelimit *d, uint64_t done) +{ + uint64_t now = local_clock(); + + d->next += div_u64(done, d->rate); + + return time_after64(d->next, now) + ? div_u64(d->next - now, NSEC_PER_SEC / HZ) + : 0; +} + +void bch_bio_map(struct bio *bio, void *base) +{ + size_t size = bio->bi_size; + struct bio_vec *bv = bio->bi_io_vec; + + BUG_ON(!bio->bi_size); + BUG_ON(bio->bi_vcnt); + + bv->bv_offset = base ? ((unsigned long) base) % PAGE_SIZE : 0; + goto start; + + for (; size; bio->bi_vcnt++, bv++) { + bv->bv_offset = 0; +start: bv->bv_len = min_t(size_t, PAGE_SIZE - bv->bv_offset, + size); + if (base) { + bv->bv_page = is_vmalloc_addr(base) + ? vmalloc_to_page(base) + : virt_to_page(base); + + base += bv->bv_len; + } + + size -= bv->bv_len; + } +} + +int bch_bio_alloc_pages(struct bio *bio, gfp_t gfp) +{ + int i; + struct bio_vec *bv; + + bio_for_each_segment(bv, bio, i) { + bv->bv_page = alloc_page(gfp); + if (!bv->bv_page) { + while (bv-- != bio->bi_io_vec + bio->bi_idx) + __free_page(bv->bv_page); + return -ENOMEM; + } + } + + return 0; +} + +/* + * Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group (Any + * use permitted, subject to terms of PostgreSQL license; see.) + + * If we have a 64-bit integer type, then a 64-bit CRC looks just like the + * usual sort of implementation. (See Ross Williams' excellent introduction + * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from + * ftp://ftp.rocksoft.com/papers/crc_v3.txt or several other net sites.) + * If we have no working 64-bit type, then fake it with two 32-bit registers. + * + * The present implementation is a normal (not "reflected", in Williams' + * terms) 64-bit CRC, using initial all-ones register contents and a final + * bit inversion. The chosen polynomial is borrowed from the DLT1 spec + * (ECMA-182, available from http://www.ecma.ch/ecma1/STAND/ECMA-182.HTM): + * + * x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46 + x^45 + + * x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 + x^31 + x^29 + x^27 + + * x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 + + * x^7 + x^4 + x + 1 +*/ + +static const uint64_t crc_table[256] = { + 0x0000000000000000ULL, 0x42F0E1EBA9EA3693ULL, 0x85E1C3D753D46D26ULL, + 0xC711223CFA3E5BB5ULL, 0x493366450E42ECDFULL, 0x0BC387AEA7A8DA4CULL, + 0xCCD2A5925D9681F9ULL, 0x8E224479F47CB76AULL, 0x9266CC8A1C85D9BEULL, + 0xD0962D61B56FEF2DULL, 0x17870F5D4F51B498ULL, 0x5577EEB6E6BB820BULL, + 0xDB55AACF12C73561ULL, 0x99A54B24BB2D03F2ULL, 0x5EB4691841135847ULL, + 0x1C4488F3E8F96ED4ULL, 0x663D78FF90E185EFULL, 0x24CD9914390BB37CULL, + 0xE3DCBB28C335E8C9ULL, 0xA12C5AC36ADFDE5AULL, 0x2F0E1EBA9EA36930ULL, + 0x6DFEFF5137495FA3ULL, 0xAAEFDD6DCD770416ULL, 0xE81F3C86649D3285ULL, + 0xF45BB4758C645C51ULL, 0xB6AB559E258E6AC2ULL, 0x71BA77A2DFB03177ULL, + 0x334A9649765A07E4ULL, 0xBD68D2308226B08EULL, 0xFF9833DB2BCC861DULL, + 0x388911E7D1F2DDA8ULL, 0x7A79F00C7818EB3BULL, 0xCC7AF1FF21C30BDEULL, + 0x8E8A101488293D4DULL, 0x499B3228721766F8ULL, 0x0B6BD3C3DBFD506BULL, + 0x854997BA2F81E701ULL, 0xC7B97651866BD192ULL, 0x00A8546D7C558A27ULL, + 0x4258B586D5BFBCB4ULL, 0x5E1C3D753D46D260ULL, 0x1CECDC9E94ACE4F3ULL, + 0xDBFDFEA26E92BF46ULL, 0x990D1F49C77889D5ULL, 0x172F5B3033043EBFULL, + 0x55DFBADB9AEE082CULL, 0x92CE98E760D05399ULL, 0xD03E790CC93A650AULL, + 0xAA478900B1228E31ULL, 0xE8B768EB18C8B8A2ULL, 0x2FA64AD7E2F6E317ULL, + 0x6D56AB3C4B1CD584ULL, 0xE374EF45BF6062EEULL, 0xA1840EAE168A547DULL, + 0x66952C92ECB40FC8ULL, 0x2465CD79455E395BULL, 0x3821458AADA7578FULL, + 0x7AD1A461044D611CULL, 0xBDC0865DFE733AA9ULL, 0xFF3067B657990C3AULL, + 0x711223CFA3E5BB50ULL, 0x33E2C2240A0F8DC3ULL, 0xF4F3E018F031D676ULL, + 0xB60301F359DBE0E5ULL, 0xDA050215EA6C212FULL, 0x98F5E3FE438617BCULL, + 0x5FE4C1C2B9B84C09ULL, 0x1D14202910527A9AULL, 0x93366450E42ECDF0ULL, + 0xD1C685BB4DC4FB63ULL, 0x16D7A787B7FAA0D6ULL, 0x5427466C1E109645ULL, + 0x4863CE9FF6E9F891ULL, 0x0A932F745F03CE02ULL, 0xCD820D48A53D95B7ULL, + 0x8F72ECA30CD7A324ULL, 0x0150A8DAF8AB144EULL, 0x43A04931514122DDULL, + 0x84B16B0DAB7F7968ULL, 0xC6418AE602954FFBULL, 0xBC387AEA7A8DA4C0ULL, + 0xFEC89B01D3679253ULL, 0x39D9B93D2959C9E6ULL, 0x7B2958D680B3FF75ULL, + 0xF50B1CAF74CF481FULL, 0xB7FBFD44DD257E8CULL, 0x70EADF78271B2539ULL, + 0x321A3E938EF113AAULL, 0x2E5EB66066087D7EULL, 0x6CAE578BCFE24BEDULL, + 0xABBF75B735DC1058ULL, 0xE94F945C9C3626CBULL, 0x676DD025684A91A1ULL, + 0x259D31CEC1A0A732ULL, 0xE28C13F23B9EFC87ULL, 0xA07CF2199274CA14ULL, + 0x167FF3EACBAF2AF1ULL, 0x548F120162451C62ULL, 0x939E303D987B47D7ULL, + 0xD16ED1D631917144ULL, 0x5F4C95AFC5EDC62EULL, 0x1DBC74446C07F0BDULL, + 0xDAAD56789639AB08ULL, 0x985DB7933FD39D9BULL, 0x84193F60D72AF34FULL, + 0xC6E9DE8B7EC0C5DCULL, 0x01F8FCB784FE9E69ULL, 0x43081D5C2D14A8FAULL, + 0xCD2A5925D9681F90ULL, 0x8FDAB8CE70822903ULL, 0x48CB9AF28ABC72B6ULL, + 0x0A3B7B1923564425ULL, 0x70428B155B4EAF1EULL, 0x32B26AFEF2A4998DULL, + 0xF5A348C2089AC238ULL, 0xB753A929A170F4ABULL, 0x3971ED50550C43C1ULL, + 0x7B810CBBFCE67552ULL, 0xBC902E8706D82EE7ULL, 0xFE60CF6CAF321874ULL, + 0xE224479F47CB76A0ULL, 0xA0D4A674EE214033ULL, 0x67C58448141F1B86ULL, + 0x253565A3BDF52D15ULL, 0xAB1721DA49899A7FULL, 0xE9E7C031E063ACECULL, + 0x2EF6E20D1A5DF759ULL, 0x6C0603E6B3B7C1CAULL, 0xF6FAE5C07D3274CDULL, + 0xB40A042BD4D8425EULL, 0x731B26172EE619EBULL, 0x31EBC7FC870C2F78ULL, + 0xBFC9838573709812ULL, 0xFD39626EDA9AAE81ULL, 0x3A28405220A4F534ULL, + 0x78D8A1B9894EC3A7ULL, 0x649C294A61B7AD73ULL, 0x266CC8A1C85D9BE0ULL, + 0xE17DEA9D3263C055ULL, 0xA38D0B769B89F6C6ULL, 0x2DAF4F0F6FF541ACULL, + 0x6F5FAEE4C61F773FULL, 0xA84E8CD83C212C8AULL, 0xEABE6D3395CB1A19ULL, + 0x90C79D3FEDD3F122ULL, 0xD2377CD44439C7B1ULL, 0x15265EE8BE079C04ULL, + 0x57D6BF0317EDAA97ULL, 0xD9F4FB7AE3911DFDULL, 0x9B041A914A7B2B6EULL, + 0x5C1538ADB04570DBULL, 0x1EE5D94619AF4648ULL, 0x02A151B5F156289CULL, + 0x4051B05E58BC1E0FULL, 0x87409262A28245BAULL, 0xC5B073890B687329ULL, + 0x4B9237F0FF14C443ULL, 0x0962D61B56FEF2D0ULL, 0xCE73F427ACC0A965ULL, + 0x8C8315CC052A9FF6ULL, 0x3A80143F5CF17F13ULL, 0x7870F5D4F51B4980ULL, + 0xBF61D7E80F251235ULL, 0xFD913603A6CF24A6ULL, 0x73B3727A52B393CCULL, + 0x31439391FB59A55FULL, 0xF652B1AD0167FEEAULL, 0xB4A25046A88DC879ULL, + 0xA8E6D8B54074A6ADULL, 0xEA16395EE99E903EULL, 0x2D071B6213A0CB8BULL, + 0x6FF7FA89BA4AFD18ULL, 0xE1D5BEF04E364A72ULL, 0xA3255F1BE7DC7CE1ULL, + 0x64347D271DE22754ULL, 0x26C49CCCB40811C7ULL, 0x5CBD6CC0CC10FAFCULL, + 0x1E4D8D2B65FACC6FULL, 0xD95CAF179FC497DAULL, 0x9BAC4EFC362EA149ULL, + 0x158E0A85C2521623ULL, 0x577EEB6E6BB820B0ULL, 0x906FC95291867B05ULL, + 0xD29F28B9386C4D96ULL, 0xCEDBA04AD0952342ULL, 0x8C2B41A1797F15D1ULL, + 0x4B3A639D83414E64ULL, 0x09CA82762AAB78F7ULL, 0x87E8C60FDED7CF9DULL, + 0xC51827E4773DF90EULL, 0x020905D88D03A2BBULL, 0x40F9E43324E99428ULL, + 0x2CFFE7D5975E55E2ULL, 0x6E0F063E3EB46371ULL, 0xA91E2402C48A38C4ULL, + 0xEBEEC5E96D600E57ULL, 0x65CC8190991CB93DULL, 0x273C607B30F68FAEULL, + 0xE02D4247CAC8D41BULL, 0xA2DDA3AC6322E288ULL, 0xBE992B5F8BDB8C5CULL, + 0xFC69CAB42231BACFULL, 0x3B78E888D80FE17AULL, 0x7988096371E5D7E9ULL, + 0xF7AA4D1A85996083ULL, 0xB55AACF12C735610ULL, 0x724B8ECDD64D0DA5ULL, + 0x30BB6F267FA73B36ULL, 0x4AC29F2A07BFD00DULL, 0x08327EC1AE55E69EULL, + 0xCF235CFD546BBD2BULL, 0x8DD3BD16FD818BB8ULL, 0x03F1F96F09FD3CD2ULL, + 0x41011884A0170A41ULL, 0x86103AB85A2951F4ULL, 0xC4E0DB53F3C36767ULL, + 0xD8A453A01B3A09B3ULL, 0x9A54B24BB2D03F20ULL, 0x5D45907748EE6495ULL, + 0x1FB5719CE1045206ULL, 0x919735E51578E56CULL, 0xD367D40EBC92D3FFULL, + 0x1476F63246AC884AULL, 0x568617D9EF46BED9ULL, 0xE085162AB69D5E3CULL, + 0xA275F7C11F7768AFULL, 0x6564D5FDE549331AULL, 0x279434164CA30589ULL, + 0xA9B6706FB8DFB2E3ULL, 0xEB46918411358470ULL, 0x2C57B3B8EB0BDFC5ULL, + 0x6EA7525342E1E956ULL, 0x72E3DAA0AA188782ULL, 0x30133B4B03F2B111ULL, + 0xF7021977F9CCEAA4ULL, 0xB5F2F89C5026DC37ULL, 0x3BD0BCE5A45A6B5DULL, + 0x79205D0E0DB05DCEULL, 0xBE317F32F78E067BULL, 0xFCC19ED95E6430E8ULL, + 0x86B86ED5267CDBD3ULL, 0xC4488F3E8F96ED40ULL, 0x0359AD0275A8B6F5ULL, + 0x41A94CE9DC428066ULL, 0xCF8B0890283E370CULL, 0x8D7BE97B81D4019FULL, + 0x4A6ACB477BEA5A2AULL, 0x089A2AACD2006CB9ULL, 0x14DEA25F3AF9026DULL, + 0x562E43B4931334FEULL, 0x913F6188692D6F4BULL, 0xD3CF8063C0C759D8ULL, + 0x5DEDC41A34BBEEB2ULL, 0x1F1D25F19D51D821ULL, 0xD80C07CD676F8394ULL, + 0x9AFCE626CE85B507ULL, +}; + +uint64_t bch_crc64_update(uint64_t crc, const void *_data, size_t len) +{ + const unsigned char *data = _data; + + while (len--) { + int i = ((int) (crc >> 56) ^ *data++) & 0xFF; + crc = crc_table[i] ^ (crc << 8); + } + + return crc; +} + +uint64_t bch_crc64(const void *data, size_t len) +{ + uint64_t crc = 0xffffffffffffffffULL; + + crc = bch_crc64_update(crc, data, len); + + return crc ^ 0xffffffffffffffffULL; +} diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h new file mode 100644 index 000000000000..577393e38c3a --- /dev/null +++ b/drivers/md/bcache/util.h @@ -0,0 +1,589 @@ + +#ifndef _BCACHE_UTIL_H +#define _BCACHE_UTIL_H + +#include +#include +#include +#include +#include +#include + +#include "closure.h" + +#define PAGE_SECTORS (PAGE_SIZE / 512) + +struct closure; + +#include + +#ifdef CONFIG_BCACHE_EDEBUG + +#define atomic_dec_bug(v) BUG_ON(atomic_dec_return(v) < 0) +#define atomic_inc_bug(v, i) BUG_ON(atomic_inc_return(v) <= i) + +#else /* EDEBUG */ + +#define atomic_dec_bug(v) atomic_dec(v) +#define atomic_inc_bug(v, i) atomic_inc(v) + +#endif + +#define BITMASK(name, type, field, offset, size) \ +static inline uint64_t name(const type *k) \ +{ return (k->field >> offset) & ~(((uint64_t) ~0) << size); } \ + \ +static inline void SET_##name(type *k, uint64_t v) \ +{ \ + k->field &= ~(~((uint64_t) ~0 << size) << offset); \ + k->field |= v << offset; \ +} + +#define DECLARE_HEAP(type, name) \ + struct { \ + size_t size, used; \ + type *data; \ + } name + +#define init_heap(heap, _size, gfp) \ +({ \ + size_t _bytes; \ + (heap)->used = 0; \ + (heap)->size = (_size); \ + _bytes = (heap)->size * sizeof(*(heap)->data); \ + (heap)->data = NULL; \ + if (_bytes < KMALLOC_MAX_SIZE) \ + (heap)->data = kmalloc(_bytes, (gfp)); \ + if ((!(heap)->data) && ((gfp) & GFP_KERNEL)) \ + (heap)->data = vmalloc(_bytes); \ + (heap)->data; \ +}) + +#define free_heap(heap) \ +do { \ + if (is_vmalloc_addr((heap)->data)) \ + vfree((heap)->data); \ + else \ + kfree((heap)->data); \ + (heap)->data = NULL; \ +} while (0) + +#define heap_swap(h, i, j) swap((h)->data[i], (h)->data[j]) + +#define heap_sift(h, i, cmp) \ +do { \ + size_t _r, _j = i; \ + \ + for (; _j * 2 + 1 < (h)->used; _j = _r) { \ + _r = _j * 2 + 1; \ + if (_r + 1 < (h)->used && \ + cmp((h)->data[_r], (h)->data[_r + 1])) \ + _r++; \ + \ + if (cmp((h)->data[_r], (h)->data[_j])) \ + break; \ + heap_swap(h, _r, _j); \ + } \ +} while (0) + +#define heap_sift_down(h, i, cmp) \ +do { \ + while (i) { \ + size_t p = (i - 1) / 2; \ + if (cmp((h)->data[i], (h)->data[p])) \ + break; \ + heap_swap(h, i, p); \ + i = p; \ + } \ +} while (0) + +#define heap_add(h, d, cmp) \ +({ \ + bool _r = !heap_full(h); \ + if (_r) { \ + size_t _i = (h)->used++; \ + (h)->data[_i] = d; \ + \ + heap_sift_down(h, _i, cmp); \ + heap_sift(h, _i, cmp); \ + } \ + _r; \ +}) + +#define heap_pop(h, d, cmp) \ +({ \ + bool _r = (h)->used; \ + if (_r) { \ + (d) = (h)->data[0]; \ + (h)->used--; \ + heap_swap(h, 0, (h)->used); \ + heap_sift(h, 0, cmp); \ + } \ + _r; \ +}) + +#define heap_peek(h) ((h)->size ? (h)->data[0] : NULL) + +#define heap_full(h) ((h)->used == (h)->size) + +#define DECLARE_FIFO(type, name) \ + struct { \ + size_t front, back, size, mask; \ + type *data; \ + } name + +#define fifo_for_each(c, fifo, iter) \ + for (iter = (fifo)->front; \ + c = (fifo)->data[iter], iter != (fifo)->back; \ + iter = (iter + 1) & (fifo)->mask) + +#define __init_fifo(fifo, gfp) \ +({ \ + size_t _allocated_size, _bytes; \ + BUG_ON(!(fifo)->size); \ + \ + _allocated_size = roundup_pow_of_two((fifo)->size + 1); \ + _bytes = _allocated_size * sizeof(*(fifo)->data); \ + \ + (fifo)->mask = _allocated_size - 1; \ + (fifo)->front = (fifo)->back = 0; \ + (fifo)->data = NULL; \ + \ + if (_bytes < KMALLOC_MAX_SIZE) \ + (fifo)->data = kmalloc(_bytes, (gfp)); \ + if ((!(fifo)->data) && ((gfp) & GFP_KERNEL)) \ + (fifo)->data = vmalloc(_bytes); \ + (fifo)->data; \ +}) + +#define init_fifo_exact(fifo, _size, gfp) \ +({ \ + (fifo)->size = (_size); \ + __init_fifo(fifo, gfp); \ +}) + +#define init_fifo(fifo, _size, gfp) \ +({ \ + (fifo)->size = (_size); \ + if ((fifo)->size > 4) \ + (fifo)->size = roundup_pow_of_two((fifo)->size) - 1; \ + __init_fifo(fifo, gfp); \ +}) + +#define free_fifo(fifo) \ +do { \ + if (is_vmalloc_addr((fifo)->data)) \ + vfree((fifo)->data); \ + else \ + kfree((fifo)->data); \ + (fifo)->data = NULL; \ +} while (0) + +#define fifo_used(fifo) (((fifo)->back - (fifo)->front) & (fifo)->mask) +#define fifo_free(fifo) ((fifo)->size - fifo_used(fifo)) + +#define fifo_empty(fifo) (!fifo_used(fifo)) +#define fifo_full(fifo) (!fifo_free(fifo)) + +#define fifo_front(fifo) ((fifo)->data[(fifo)->front]) +#define fifo_back(fifo) \ + ((fifo)->data[((fifo)->back - 1) & (fifo)->mask]) + +#define fifo_idx(fifo, p) (((p) - &fifo_front(fifo)) & (fifo)->mask) + +#define fifo_push_back(fifo, i) \ +({ \ + bool _r = !fifo_full((fifo)); \ + if (_r) { \ + (fifo)->data[(fifo)->back++] = (i); \ + (fifo)->back &= (fifo)->mask; \ + } \ + _r; \ +}) + +#define fifo_pop_front(fifo, i) \ +({ \ + bool _r = !fifo_empty((fifo)); \ + if (_r) { \ + (i) = (fifo)->data[(fifo)->front++]; \ + (fifo)->front &= (fifo)->mask; \ + } \ + _r; \ +}) + +#define fifo_push_front(fifo, i) \ +({ \ + bool _r = !fifo_full((fifo)); \ + if (_r) { \ + --(fifo)->front; \ + (fifo)->front &= (fifo)->mask; \ + (fifo)->data[(fifo)->front] = (i); \ + } \ + _r; \ +}) + +#define fifo_pop_back(fifo, i) \ +({ \ + bool _r = !fifo_empty((fifo)); \ + if (_r) { \ + --(fifo)->back; \ + (fifo)->back &= (fifo)->mask; \ + (i) = (fifo)->data[(fifo)->back] \ + } \ + _r; \ +}) + +#define fifo_push(fifo, i) fifo_push_back(fifo, (i)) +#define fifo_pop(fifo, i) fifo_pop_front(fifo, (i)) + +#define fifo_swap(l, r) \ +do { \ + swap((l)->front, (r)->front); \ + swap((l)->back, (r)->back); \ + swap((l)->size, (r)->size); \ + swap((l)->mask, (r)->mask); \ + swap((l)->data, (r)->data); \ +} while (0) + +#define fifo_move(dest, src) \ +do { \ + typeof(*((dest)->data)) _t; \ + while (!fifo_full(dest) && \ + fifo_pop(src, _t)) \ + fifo_push(dest, _t); \ +} while (0) + +/* + * Simple array based allocator - preallocates a number of elements and you can + * never allocate more than that, also has no locking. + * + * Handy because if you know you only need a fixed number of elements you don't + * have to worry about memory allocation failure, and sometimes a mempool isn't + * what you want. + * + * We treat the free elements as entries in a singly linked list, and the + * freelist as a stack - allocating and freeing push and pop off the freelist. + */ + +#define DECLARE_ARRAY_ALLOCATOR(type, name, size) \ + struct { \ + type *freelist; \ + type data[size]; \ + } name + +#define array_alloc(array) \ +({ \ + typeof((array)->freelist) _ret = (array)->freelist; \ + \ + if (_ret) \ + (array)->freelist = *((typeof((array)->freelist) *) _ret);\ + \ + _ret; \ +}) + +#define array_free(array, ptr) \ +do { \ + typeof((array)->freelist) _ptr = ptr; \ + \ + *((typeof((array)->freelist) *) _ptr) = (array)->freelist; \ + (array)->freelist = _ptr; \ +} while (0) + +#define array_allocator_init(array) \ +do { \ + typeof((array)->freelist) _i; \ + \ + BUILD_BUG_ON(sizeof((array)->data[0]) < sizeof(void *)); \ + (array)->freelist = NULL; \ + \ + for (_i = (array)->data; \ + _i < (array)->data + ARRAY_SIZE((array)->data); \ + _i++) \ + array_free(array, _i); \ +} while (0) + +#define array_freelist_empty(array) ((array)->freelist == NULL) + +#define ANYSINT_MAX(t) \ + ((((t) 1 << (sizeof(t) * 8 - 2)) - (t) 1) * (t) 2 + (t) 1) + +int bch_strtoint_h(const char *, int *); +int bch_strtouint_h(const char *, unsigned int *); +int bch_strtoll_h(const char *, long long *); +int bch_strtoull_h(const char *, unsigned long long *); + +static inline int bch_strtol_h(const char *cp, long *res) +{ +#if BITS_PER_LONG == 32 + return bch_strtoint_h(cp, (int *) res); +#else + return bch_strtoll_h(cp, (long long *) res); +#endif +} + +static inline int bch_strtoul_h(const char *cp, long *res) +{ +#if BITS_PER_LONG == 32 + return bch_strtouint_h(cp, (unsigned int *) res); +#else + return bch_strtoull_h(cp, (unsigned long long *) res); +#endif +} + +#define strtoi_h(cp, res) \ + (__builtin_types_compatible_p(typeof(*res), int) \ + ? bch_strtoint_h(cp, (void *) res) \ + : __builtin_types_compatible_p(typeof(*res), long) \ + ? bch_strtol_h(cp, (void *) res) \ + : __builtin_types_compatible_p(typeof(*res), long long) \ + ? bch_strtoll_h(cp, (void *) res) \ + : __builtin_types_compatible_p(typeof(*res), unsigned int) \ + ? bch_strtouint_h(cp, (void *) res) \ + : __builtin_types_compatible_p(typeof(*res), unsigned long) \ + ? bch_strtoul_h(cp, (void *) res) \ + : __builtin_types_compatible_p(typeof(*res), unsigned long long)\ + ? bch_strtoull_h(cp, (void *) res) : -EINVAL) + +#define strtoul_safe(cp, var) \ +({ \ + unsigned long _v; \ + int _r = kstrtoul(cp, 10, &_v); \ + if (!_r) \ + var = _v; \ + _r; \ +}) + +#define strtoul_safe_clamp(cp, var, min, max) \ +({ \ + unsigned long _v; \ + int _r = kstrtoul(cp, 10, &_v); \ + if (!_r) \ + var = clamp_t(typeof(var), _v, min, max); \ + _r; \ +}) + +#define snprint(buf, size, var) \ + snprintf(buf, size, \ + __builtin_types_compatible_p(typeof(var), int) \ + ? "%i\n" : \ + __builtin_types_compatible_p(typeof(var), unsigned) \ + ? "%u\n" : \ + __builtin_types_compatible_p(typeof(var), long) \ + ? "%li\n" : \ + __builtin_types_compatible_p(typeof(var), unsigned long)\ + ? "%lu\n" : \ + __builtin_types_compatible_p(typeof(var), int64_t) \ + ? "%lli\n" : \ + __builtin_types_compatible_p(typeof(var), uint64_t) \ + ? "%llu\n" : \ + __builtin_types_compatible_p(typeof(var), const char *) \ + ? "%s\n" : "%i\n", var) + +ssize_t bch_hprint(char *buf, int64_t v); + +bool bch_is_zero(const char *p, size_t n); +int bch_parse_uuid(const char *s, char *uuid); + +ssize_t bch_snprint_string_list(char *buf, size_t size, const char * const list[], + size_t selected); + +ssize_t bch_read_string_list(const char *buf, const char * const list[]); + +struct time_stats { + /* + * all fields are in nanoseconds, averages are ewmas stored left shifted + * by 8 + */ + uint64_t max_duration; + uint64_t average_duration; + uint64_t average_frequency; + uint64_t last; +}; + +void bch_time_stats_update(struct time_stats *stats, uint64_t time); + +#define NSEC_PER_ns 1L +#define NSEC_PER_us NSEC_PER_USEC +#define NSEC_PER_ms NSEC_PER_MSEC +#define NSEC_PER_sec NSEC_PER_SEC + +#define __print_time_stat(stats, name, stat, units) \ + sysfs_print(name ## _ ## stat ## _ ## units, \ + div_u64((stats)->stat >> 8, NSEC_PER_ ## units)) + +#define sysfs_print_time_stats(stats, name, \ + frequency_units, \ + duration_units) \ +do { \ + __print_time_stat(stats, name, \ + average_frequency, frequency_units); \ + __print_time_stat(stats, name, \ + average_duration, duration_units); \ + __print_time_stat(stats, name, \ + max_duration, duration_units); \ + \ + sysfs_print(name ## _last_ ## frequency_units, (stats)->last \ + ? div_s64(local_clock() - (stats)->last, \ + NSEC_PER_ ## frequency_units) \ + : -1LL); \ +} while (0) + +#define sysfs_time_stats_attribute(name, \ + frequency_units, \ + duration_units) \ +read_attribute(name ## _average_frequency_ ## frequency_units); \ +read_attribute(name ## _average_duration_ ## duration_units); \ +read_attribute(name ## _max_duration_ ## duration_units); \ +read_attribute(name ## _last_ ## frequency_units) + +#define sysfs_time_stats_attribute_list(name, \ + frequency_units, \ + duration_units) \ +&sysfs_ ## name ## _average_frequency_ ## frequency_units, \ +&sysfs_ ## name ## _average_duration_ ## duration_units, \ +&sysfs_ ## name ## _max_duration_ ## duration_units, \ +&sysfs_ ## name ## _last_ ## frequency_units, + +#define ewma_add(ewma, val, weight, factor) \ +({ \ + (ewma) *= (weight) - 1; \ + (ewma) += (val) << factor; \ + (ewma) /= (weight); \ + (ewma) >> factor; \ +}) + +struct ratelimit { + uint64_t next; + unsigned rate; +}; + +static inline void ratelimit_reset(struct ratelimit *d) +{ + d->next = local_clock(); +} + +unsigned bch_next_delay(struct ratelimit *d, uint64_t done); + +#define __DIV_SAFE(n, d, zero) \ +({ \ + typeof(n) _n = (n); \ + typeof(d) _d = (d); \ + _d ? _n / _d : zero; \ +}) + +#define DIV_SAFE(n, d) __DIV_SAFE(n, d, 0) + +#define container_of_or_null(ptr, type, member) \ +({ \ + typeof(ptr) _ptr = ptr; \ + _ptr ? container_of(_ptr, type, member) : NULL; \ +}) + +#define RB_INSERT(root, new, member, cmp) \ +({ \ + __label__ dup; \ + struct rb_node **n = &(root)->rb_node, *parent = NULL; \ + typeof(new) this; \ + int res, ret = -1; \ + \ + while (*n) { \ + parent = *n; \ + this = container_of(*n, typeof(*(new)), member); \ + res = cmp(new, this); \ + if (!res) \ + goto dup; \ + n = res < 0 \ + ? &(*n)->rb_left \ + : &(*n)->rb_right; \ + } \ + \ + rb_link_node(&(new)->member, parent, n); \ + rb_insert_color(&(new)->member, root); \ + ret = 0; \ +dup: \ + ret; \ +}) + +#define RB_SEARCH(root, search, member, cmp) \ +({ \ + struct rb_node *n = (root)->rb_node; \ + typeof(&(search)) this, ret = NULL; \ + int res; \ + \ + while (n) { \ + this = container_of(n, typeof(search), member); \ + res = cmp(&(search), this); \ + if (!res) { \ + ret = this; \ + break; \ + } \ + n = res < 0 \ + ? n->rb_left \ + : n->rb_right; \ + } \ + ret; \ +}) + +#define RB_GREATER(root, search, member, cmp) \ +({ \ + struct rb_node *n = (root)->rb_node; \ + typeof(&(search)) this, ret = NULL; \ + int res; \ + \ + while (n) { \ + this = container_of(n, typeof(search), member); \ + res = cmp(&(search), this); \ + if (res < 0) { \ + ret = this; \ + n = n->rb_left; \ + } else \ + n = n->rb_right; \ + } \ + ret; \ +}) + +#define RB_FIRST(root, type, member) \ + container_of_or_null(rb_first(root), type, member) + +#define RB_LAST(root, type, member) \ + container_of_or_null(rb_last(root), type, member) + +#define RB_NEXT(ptr, member) \ + container_of_or_null(rb_next(&(ptr)->member), typeof(*ptr), member) + +#define RB_PREV(ptr, member) \ + container_of_or_null(rb_prev(&(ptr)->member), typeof(*ptr), member) + +/* Does linear interpolation between powers of two */ +static inline unsigned fract_exp_two(unsigned x, unsigned fract_bits) +{ + unsigned fract = x & ~(~0 << fract_bits); + + x >>= fract_bits; + x = 1 << x; + x += (x * fract) >> fract_bits; + + return x; +} + +#define bio_end(bio) ((bio)->bi_sector + bio_sectors(bio)) + +void bch_bio_map(struct bio *bio, void *base); + +int bch_bio_alloc_pages(struct bio *bio, gfp_t gfp); + +static inline sector_t bdev_sectors(struct block_device *bdev) +{ + return bdev->bd_inode->i_size >> 9; +} + +#define closure_bio_submit(bio, cl, dev) \ +do { \ + closure_get(cl); \ + bch_generic_make_request(bio, &(dev)->bio_split_hook); \ +} while (0) + +uint64_t bch_crc64_update(uint64_t, const void *, size_t); +uint64_t bch_crc64(const void *, size_t); + +#endif /* _BCACHE_UTIL_H */ diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c new file mode 100644 index 000000000000..93e7e31a4bd3 --- /dev/null +++ b/drivers/md/bcache/writeback.c @@ -0,0 +1,414 @@ +/* + * background writeback - scan btree for dirty data and write it to the backing + * device + * + * Copyright 2010, 2011 Kent Overstreet + * Copyright 2012 Google, Inc. + */ + +#include "bcache.h" +#include "btree.h" +#include "debug.h" + +static struct workqueue_struct *dirty_wq; + +static void read_dirty(struct closure *); + +struct dirty_io { + struct closure cl; + struct cached_dev *dc; + struct bio bio; +}; + +/* Rate limiting */ + +static void __update_writeback_rate(struct cached_dev *dc) +{ + struct cache_set *c = dc->disk.c; + uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size; + uint64_t cache_dirty_target = + div_u64(cache_sectors * dc->writeback_percent, 100); + + int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev), + c->cached_dev_sectors); + + /* PD controller */ + + int change = 0; + int64_t error; + int64_t dirty = atomic_long_read(&dc->disk.sectors_dirty); + int64_t derivative = dirty - dc->disk.sectors_dirty_last; + + dc->disk.sectors_dirty_last = dirty; + + derivative *= dc->writeback_rate_d_term; + derivative = clamp(derivative, -dirty, dirty); + + derivative = ewma_add(dc->disk.sectors_dirty_derivative, derivative, + dc->writeback_rate_d_smooth, 0); + + /* Avoid divide by zero */ + if (!target) + goto out; + + error = div64_s64((dirty + derivative - target) << 8, target); + + change = div_s64((dc->writeback_rate.rate * error) >> 8, + dc->writeback_rate_p_term_inverse); + + /* Don't increase writeback rate if the device isn't keeping up */ + if (change > 0 && + time_after64(local_clock(), + dc->writeback_rate.next + 10 * NSEC_PER_MSEC)) + change = 0; + + dc->writeback_rate.rate = + clamp_t(int64_t, dc->writeback_rate.rate + change, + 1, NSEC_PER_MSEC); +out: + dc->writeback_rate_derivative = derivative; + dc->writeback_rate_change = change; + dc->writeback_rate_target = target; + + schedule_delayed_work(&dc->writeback_rate_update, + dc->writeback_rate_update_seconds * HZ); +} + +static void update_writeback_rate(struct work_struct *work) +{ + struct cached_dev *dc = container_of(to_delayed_work(work), + struct cached_dev, + writeback_rate_update); + + down_read(&dc->writeback_lock); + + if (atomic_read(&dc->has_dirty) && + dc->writeback_percent) + __update_writeback_rate(dc); + + up_read(&dc->writeback_lock); +} + +static unsigned writeback_delay(struct cached_dev *dc, unsigned sectors) +{ + if (atomic_read(&dc->disk.detaching) || + !dc->writeback_percent) + return 0; + + return bch_next_delay(&dc->writeback_rate, sectors * 10000000ULL); +} + +/* Background writeback */ + +static bool dirty_pred(struct keybuf *buf, struct bkey *k) +{ + return KEY_DIRTY(k); +} + +static void dirty_init(struct keybuf_key *w) +{ + struct dirty_io *io = w->private; + struct bio *bio = &io->bio; + + bio_init(bio); + if (!io->dc->writeback_percent) + bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)); + + bio->bi_size = KEY_SIZE(&w->key) << 9; + bio->bi_max_vecs = DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS); + bio->bi_private = w; + bio->bi_io_vec = bio->bi_inline_vecs; + bch_bio_map(bio, NULL); +} + +static void refill_dirty(struct closure *cl) +{ + struct cached_dev *dc = container_of(cl, struct cached_dev, + writeback.cl); + struct keybuf *buf = &dc->writeback_keys; + bool searched_from_start = false; + struct bkey end = MAX_KEY; + SET_KEY_INODE(&end, dc->disk.id); + + if (!atomic_read(&dc->disk.detaching) && + !dc->writeback_running) + closure_return(cl); + + down_write(&dc->writeback_lock); + + if (!atomic_read(&dc->has_dirty)) { + SET_BDEV_STATE(&dc->sb, BDEV_STATE_CLEAN); + bch_write_bdev_super(dc, NULL); + + up_write(&dc->writeback_lock); + closure_return(cl); + } + + if (bkey_cmp(&buf->last_scanned, &end) >= 0) { + buf->last_scanned = KEY(dc->disk.id, 0, 0); + searched_from_start = true; + } + + bch_refill_keybuf(dc->disk.c, buf, &end); + + if (bkey_cmp(&buf->last_scanned, &end) >= 0 && searched_from_start) { + /* Searched the entire btree - delay awhile */ + + if (RB_EMPTY_ROOT(&buf->keys)) { + atomic_set(&dc->has_dirty, 0); + cached_dev_put(dc); + } + + if (!atomic_read(&dc->disk.detaching)) + closure_delay(&dc->writeback, dc->writeback_delay * HZ); + } + + up_write(&dc->writeback_lock); + + ratelimit_reset(&dc->writeback_rate); + + /* Punt to workqueue only so we don't recurse and blow the stack */ + continue_at(cl, read_dirty, dirty_wq); +} + +void bch_writeback_queue(struct cached_dev *dc) +{ + if (closure_trylock(&dc->writeback.cl, &dc->disk.cl)) { + if (!atomic_read(&dc->disk.detaching)) + closure_delay(&dc->writeback, dc->writeback_delay * HZ); + + continue_at(&dc->writeback.cl, refill_dirty, dirty_wq); + } +} + +void bch_writeback_add(struct cached_dev *dc, unsigned sectors) +{ + atomic_long_add(sectors, &dc->disk.sectors_dirty); + + if (!atomic_read(&dc->has_dirty) && + !atomic_xchg(&dc->has_dirty, 1)) { + atomic_inc(&dc->count); + + if (BDEV_STATE(&dc->sb) != BDEV_STATE_DIRTY) { + SET_BDEV_STATE(&dc->sb, BDEV_STATE_DIRTY); + /* XXX: should do this synchronously */ + bch_write_bdev_super(dc, NULL); + } + + bch_writeback_queue(dc); + + if (dc->writeback_percent) + schedule_delayed_work(&dc->writeback_rate_update, + dc->writeback_rate_update_seconds * HZ); + } +} + +/* Background writeback - IO loop */ + +static void dirty_io_destructor(struct closure *cl) +{ + struct dirty_io *io = container_of(cl, struct dirty_io, cl); + kfree(io); +} + +static void write_dirty_finish(struct closure *cl) +{ + struct dirty_io *io = container_of(cl, struct dirty_io, cl); + struct keybuf_key *w = io->bio.bi_private; + struct cached_dev *dc = io->dc; + struct bio_vec *bv = bio_iovec_idx(&io->bio, io->bio.bi_vcnt); + + while (bv-- != io->bio.bi_io_vec) + __free_page(bv->bv_page); + + /* This is kind of a dumb way of signalling errors. */ + if (KEY_DIRTY(&w->key)) { + unsigned i; + struct btree_op op; + bch_btree_op_init_stack(&op); + + op.type = BTREE_REPLACE; + bkey_copy(&op.replace, &w->key); + + SET_KEY_DIRTY(&w->key, false); + bch_keylist_add(&op.keys, &w->key); + + for (i = 0; i < KEY_PTRS(&w->key); i++) + atomic_inc(&PTR_BUCKET(dc->disk.c, &w->key, i)->pin); + + pr_debug("clearing %s", pkey(&w->key)); + bch_btree_insert(&op, dc->disk.c); + closure_sync(&op.cl); + + atomic_long_inc(op.insert_collision + ? &dc->disk.c->writeback_keys_failed + : &dc->disk.c->writeback_keys_done); + } + + bch_keybuf_del(&dc->writeback_keys, w); + atomic_dec_bug(&dc->in_flight); + + closure_wake_up(&dc->writeback_wait); + + closure_return_with_destructor(cl, dirty_io_destructor); +} + +static void dirty_endio(struct bio *bio, int error) +{ + struct keybuf_key *w = bio->bi_private; + struct dirty_io *io = w->private; + + if (error) + SET_KEY_DIRTY(&w->key, false); + + closure_put(&io->cl); +} + +static void write_dirty(struct closure *cl) +{ + struct dirty_io *io = container_of(cl, struct dirty_io, cl); + struct keybuf_key *w = io->bio.bi_private; + + dirty_init(w); + io->bio.bi_rw = WRITE; + io->bio.bi_sector = KEY_START(&w->key); + io->bio.bi_bdev = io->dc->bdev; + io->bio.bi_end_io = dirty_endio; + + trace_bcache_write_dirty(&io->bio); + closure_bio_submit(&io->bio, cl, &io->dc->disk); + + continue_at(cl, write_dirty_finish, dirty_wq); +} + +static void read_dirty_endio(struct bio *bio, int error) +{ + struct keybuf_key *w = bio->bi_private; + struct dirty_io *io = w->private; + + bch_count_io_errors(PTR_CACHE(io->dc->disk.c, &w->key, 0), + error, "reading dirty data from cache"); + + dirty_endio(bio, error); +} + +static void read_dirty_submit(struct closure *cl) +{ + struct dirty_io *io = container_of(cl, struct dirty_io, cl); + + trace_bcache_read_dirty(&io->bio); + closure_bio_submit(&io->bio, cl, &io->dc->disk); + + continue_at(cl, write_dirty, dirty_wq); +} + +static void read_dirty(struct closure *cl) +{ + struct cached_dev *dc = container_of(cl, struct cached_dev, + writeback.cl); + unsigned delay = writeback_delay(dc, 0); + struct keybuf_key *w; + struct dirty_io *io; + + /* + * XXX: if we error, background writeback just spins. Should use some + * mempools. + */ + + while (1) { + w = bch_keybuf_next(&dc->writeback_keys); + if (!w) + break; + + BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); + + if (delay > 0 && + (KEY_START(&w->key) != dc->last_read || + jiffies_to_msecs(delay) > 50)) { + w->private = NULL; + + closure_delay(&dc->writeback, delay); + continue_at(cl, read_dirty, dirty_wq); + } + + dc->last_read = KEY_OFFSET(&w->key); + + io = kzalloc(sizeof(struct dirty_io) + sizeof(struct bio_vec) + * DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS), + GFP_KERNEL); + if (!io) + goto err; + + w->private = io; + io->dc = dc; + + dirty_init(w); + io->bio.bi_sector = PTR_OFFSET(&w->key, 0); + io->bio.bi_bdev = PTR_CACHE(dc->disk.c, + &w->key, 0)->bdev; + io->bio.bi_rw = READ; + io->bio.bi_end_io = read_dirty_endio; + + if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL)) + goto err_free; + + pr_debug("%s", pkey(&w->key)); + + closure_call(&io->cl, read_dirty_submit, NULL, &dc->disk.cl); + + delay = writeback_delay(dc, KEY_SIZE(&w->key)); + + atomic_inc(&dc->in_flight); + + if (!closure_wait_event(&dc->writeback_wait, cl, + atomic_read(&dc->in_flight) < 64)) + continue_at(cl, read_dirty, dirty_wq); + } + + if (0) { +err_free: + kfree(w->private); +err: + bch_keybuf_del(&dc->writeback_keys, w); + } + + refill_dirty(cl); +} + +void bch_writeback_init_cached_dev(struct cached_dev *dc) +{ + closure_init_unlocked(&dc->writeback); + init_rwsem(&dc->writeback_lock); + + bch_keybuf_init(&dc->writeback_keys, dirty_pred); + + dc->writeback_metadata = true; + dc->writeback_running = true; + dc->writeback_percent = 10; + dc->writeback_delay = 30; + dc->writeback_rate.rate = 1024; + + dc->writeback_rate_update_seconds = 30; + dc->writeback_rate_d_term = 16; + dc->writeback_rate_p_term_inverse = 64; + dc->writeback_rate_d_smooth = 8; + + INIT_DELAYED_WORK(&dc->writeback_rate_update, update_writeback_rate); + schedule_delayed_work(&dc->writeback_rate_update, + dc->writeback_rate_update_seconds * HZ); +} + +void bch_writeback_exit(void) +{ + if (dirty_wq) + destroy_workqueue(dirty_wq); +} + +int __init bch_writeback_init(void) +{ + dirty_wq = create_singlethread_workqueue("bcache_writeback"); + if (!dirty_wq) + return -ENOMEM; + + return 0; +} diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index c6083132c4b8..0387e05cdb98 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -319,6 +319,9 @@ static void __cache_size_refresh(void) static void *alloc_buffer_data(struct dm_bufio_client *c, gfp_t gfp_mask, enum data_mode *data_mode) { + unsigned noio_flag; + void *ptr; + if (c->block_size <= DM_BUFIO_BLOCK_SIZE_SLAB_LIMIT) { *data_mode = DATA_MODE_SLAB; return kmem_cache_alloc(DM_BUFIO_CACHE(c), gfp_mask); @@ -332,7 +335,26 @@ static void *alloc_buffer_data(struct dm_bufio_client *c, gfp_t gfp_mask, } *data_mode = DATA_MODE_VMALLOC; - return __vmalloc(c->block_size, gfp_mask, PAGE_KERNEL); + + /* + * __vmalloc allocates the data pages and auxiliary structures with + * gfp_flags that were specified, but pagetables are always allocated + * with GFP_KERNEL, no matter what was specified as gfp_mask. + * + * Consequently, we must set per-process flag PF_MEMALLOC_NOIO so that + * all allocations done by this process (including pagetables) are done + * as if GFP_NOIO was specified. + */ + + if (gfp_mask & __GFP_NORETRY) + noio_flag = memalloc_noio_save(); + + ptr = __vmalloc(c->block_size, gfp_mask, PAGE_KERNEL); + + if (gfp_mask & __GFP_NORETRY) + memalloc_noio_restore(noio_flag); + + return ptr; } /* diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c index 83e995fece88..1af7255bbffb 100644 --- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -1044,7 +1044,7 @@ void dm_cache_metadata_get_stats(struct dm_cache_metadata *cmd, struct dm_cache_statistics *stats) { down_read(&cmd->root_lock); - memcpy(stats, &cmd->stats, sizeof(*stats)); + *stats = cmd->stats; up_read(&cmd->root_lock); } @@ -1052,7 +1052,7 @@ void dm_cache_metadata_set_stats(struct dm_cache_metadata *cmd, struct dm_cache_statistics *stats) { down_write(&cmd->root_lock); - memcpy(&cmd->stats, stats, sizeof(*stats)); + cmd->stats = *stats; up_write(&cmd->root_lock); } diff --git a/drivers/md/dm-cache-policy.h b/drivers/md/dm-cache-policy.h index 558bdfdabf5f..33369ca9614f 100644 --- a/drivers/md/dm-cache-policy.h +++ b/drivers/md/dm-cache-policy.h @@ -130,8 +130,8 @@ struct dm_cache_policy { * * Must not block. * - * Returns 1 iff in cache, 0 iff not, < 0 on error (-EWOULDBLOCK - * would be typical). + * Returns 0 if in cache, -ENOENT if not, < 0 for other errors + * (-EWOULDBLOCK would be typical). */ int (*lookup)(struct dm_cache_policy *p, dm_oblock_t oblock, dm_cblock_t *cblock); diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c index 10744091e6ca..df44b60e66f2 100644 --- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -205,7 +205,7 @@ struct per_bio_data { /* * writethrough fields. These MUST remain at the end of this * structure and the 'cache' member must be the first as it - * is used to determine the offsetof the writethrough fields. + * is used to determine the offset of the writethrough fields. */ struct cache *cache; dm_cblock_t cblock; @@ -393,7 +393,7 @@ static int get_cell(struct cache *cache, return r; } - /*----------------------------------------------------------------*/ +/*----------------------------------------------------------------*/ static bool is_dirty(struct cache *cache, dm_cblock_t b) { @@ -419,6 +419,7 @@ static void clear_dirty(struct cache *cache, dm_oblock_t oblock, dm_cblock_t cbl } /*----------------------------------------------------------------*/ + static bool block_size_is_power_of_two(struct cache *cache) { return cache->sectors_per_block_shift >= 0; @@ -667,7 +668,7 @@ static void writethrough_endio(struct bio *bio, int err) /* * We can't issue this bio directly, since we're in interrupt - * context. So it get's put on a bio list for processing by the + * context. So it gets put on a bio list for processing by the * worker thread. */ defer_writethrough_bio(pb->cache, bio); @@ -1445,6 +1446,7 @@ static void do_worker(struct work_struct *ws) static void do_waker(struct work_struct *ws) { struct cache *cache = container_of(to_delayed_work(ws), struct cache, waker); + policy_tick(cache->policy); wake_worker(cache); queue_delayed_work(cache->wq, &cache->waker, COMMIT_PERIOD); } @@ -1809,7 +1811,37 @@ static int parse_cache_args(struct cache_args *ca, int argc, char **argv, static struct kmem_cache *migration_cache; -static int set_config_values(struct dm_cache_policy *p, int argc, const char **argv) +#define NOT_CORE_OPTION 1 + +static int process_config_option(struct cache *cache, const char *key, const char *value) +{ + unsigned long tmp; + + if (!strcasecmp(key, "migration_threshold")) { + if (kstrtoul(value, 10, &tmp)) + return -EINVAL; + + cache->migration_threshold = tmp; + return 0; + } + + return NOT_CORE_OPTION; +} + +static int set_config_value(struct cache *cache, const char *key, const char *value) +{ + int r = process_config_option(cache, key, value); + + if (r == NOT_CORE_OPTION) + r = policy_set_config_value(cache->policy, key, value); + + if (r) + DMWARN("bad config value for %s: %s", key, value); + + return r; +} + +static int set_config_values(struct cache *cache, int argc, const char **argv) { int r = 0; @@ -1819,12 +1851,9 @@ static int set_config_values(struct dm_cache_policy *p, int argc, const char **a } while (argc) { - r = policy_set_config_value(p, argv[0], argv[1]); - if (r) { - DMWARN("policy_set_config_value failed: key = '%s', value = '%s'", - argv[0], argv[1]); - return r; - } + r = set_config_value(cache, argv[0], argv[1]); + if (r) + break; argc -= 2; argv += 2; @@ -1836,8 +1865,6 @@ static int set_config_values(struct dm_cache_policy *p, int argc, const char **a static int create_cache_policy(struct cache *cache, struct cache_args *ca, char **error) { - int r; - cache->policy = dm_cache_policy_create(ca->policy_name, cache->cache_size, cache->origin_sectors, @@ -1847,14 +1874,7 @@ static int create_cache_policy(struct cache *cache, struct cache_args *ca, return -ENOMEM; } - r = set_config_values(cache->policy, ca->policy_argc, ca->policy_argv); - if (r) { - *error = "Error setting cache policy's config values"; - dm_cache_policy_destroy(cache->policy); - cache->policy = NULL; - } - - return r; + return 0; } /* @@ -1886,7 +1906,7 @@ static sector_t calculate_discard_block_size(sector_t cache_block_size, return discard_block_size; } -#define DEFAULT_MIGRATION_THRESHOLD (2048 * 100) +#define DEFAULT_MIGRATION_THRESHOLD 2048 static int cache_create(struct cache_args *ca, struct cache **result) { @@ -1911,7 +1931,7 @@ static int cache_create(struct cache_args *ca, struct cache **result) ti->discards_supported = true; ti->discard_zeroes_data_unsupported = true; - memcpy(&cache->features, &ca->features, sizeof(cache->features)); + cache->features = ca->features; ti->per_bio_data_size = get_per_bio_data_size(cache); cache->callbacks.congested_fn = cache_is_congested; @@ -1948,7 +1968,15 @@ static int cache_create(struct cache_args *ca, struct cache **result) r = create_cache_policy(cache, ca, error); if (r) goto bad; + cache->policy_nr_args = ca->policy_argc; + cache->migration_threshold = DEFAULT_MIGRATION_THRESHOLD; + + r = set_config_values(cache, ca->policy_argc, ca->policy_argv); + if (r) { + *error = "Error setting cache policy's config values"; + goto bad; + } cmd = dm_cache_metadata_open(cache->metadata_dev->bdev, ca->block_size, may_format, @@ -1967,10 +1995,10 @@ static int cache_create(struct cache_args *ca, struct cache **result) INIT_LIST_HEAD(&cache->quiesced_migrations); INIT_LIST_HEAD(&cache->completed_migrations); INIT_LIST_HEAD(&cache->need_commit_migrations); - cache->migration_threshold = DEFAULT_MIGRATION_THRESHOLD; atomic_set(&cache->nr_migrations, 0); init_waitqueue_head(&cache->migration_wait); + r = -ENOMEM; cache->nr_dirty = 0; cache->dirty_bitset = alloc_bitset(from_cblock(cache->cache_size)); if (!cache->dirty_bitset) { @@ -2517,23 +2545,6 @@ err: DMEMIT("Error"); } -#define NOT_CORE_OPTION 1 - -static int process_config_option(struct cache *cache, char **argv) -{ - unsigned long tmp; - - if (!strcasecmp(argv[0], "migration_threshold")) { - if (kstrtoul(argv[1], 10, &tmp)) - return -EINVAL; - - cache->migration_threshold = tmp; - return 0; - } - - return NOT_CORE_OPTION; -} - /* * Supports . * @@ -2541,17 +2552,12 @@ static int process_config_option(struct cache *cache, char **argv) */ static int cache_message(struct dm_target *ti, unsigned argc, char **argv) { - int r; struct cache *cache = ti->private; if (argc != 2) return -EINVAL; - r = process_config_option(cache, argv); - if (r == NOT_CORE_OPTION) - return policy_set_config_value(cache->policy, argv[0], argv[1]); - - return r; + return set_config_value(cache, argv[0], argv[1]); } static int cache_iterate_devices(struct dm_target *ti, @@ -2609,7 +2615,7 @@ static void cache_io_hints(struct dm_target *ti, struct queue_limits *limits) static struct target_type cache_target = { .name = "cache", - .version = {1, 1, 0}, + .version = {1, 1, 1}, .module = THIS_MODULE, .ctr = cache_ctr, .dtr = cache_dtr, diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 13c15480d940..6d2d41ae9e32 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -858,8 +858,7 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone) unsigned int i; struct bio_vec *bv; - for (i = 0; i < clone->bi_vcnt; i++) { - bv = bio_iovec_idx(clone, i); + bio_for_each_segment_all(bv, clone, i) { BUG_ON(!bv->bv_page); mempool_free(bv->bv_page, cc->page_pool); bv->bv_page = NULL; diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 51bb81676be3..bdf26f5bd326 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -907,6 +907,7 @@ static int multipath_ctr(struct dm_target *ti, unsigned int argc, ti->num_flush_bios = 1; ti->num_discard_bios = 1; + ti->num_write_same_bios = 1; return 0; diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c index d053098c6a91..699b5be68d31 100644 --- a/drivers/md/dm-raid1.c +++ b/drivers/md/dm-raid1.c @@ -458,7 +458,7 @@ static void map_region(struct dm_io_region *io, struct mirror *m, { io->bdev = m->dev->bdev; io->sector = map_sector(m, bio); - io->count = bio->bi_size >> 9; + io->count = bio_sectors(bio); } static void hold_bio(struct mirror_set *ms, struct bio *bio) diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index c0e07026a8d1..c434e5aab2df 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -1121,6 +1121,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) s->pending_pool = mempool_create_slab_pool(MIN_IOS, pending_cache); if (!s->pending_pool) { ti->error = "Could not allocate mempool for pending exceptions"; + r = -ENOMEM; goto bad_pending_pool; } diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c index d8837d313f54..d907ca6227ce 100644 --- a/drivers/md/dm-stripe.c +++ b/drivers/md/dm-stripe.c @@ -94,7 +94,7 @@ static int get_stripe(struct dm_target *ti, struct stripe_c *sc, static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) { struct stripe_c *sc; - sector_t width; + sector_t width, tmp_len; uint32_t stripes; uint32_t chunk_size; int r; @@ -116,15 +116,16 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) } width = ti->len; - if (sector_div(width, chunk_size)) { + if (sector_div(width, stripes)) { ti->error = "Target length not divisible by " - "chunk size"; + "number of stripes"; return -EINVAL; } - if (sector_div(width, stripes)) { + tmp_len = width; + if (sector_div(tmp_len, chunk_size)) { ti->error = "Target length not divisible by " - "number of stripes"; + "chunk size"; return -EINVAL; } @@ -258,7 +259,7 @@ static int stripe_map_range(struct stripe_c *sc, struct bio *bio, sector_t begin, end; stripe_map_range_sector(sc, bio->bi_sector, target_stripe, &begin); - stripe_map_range_sector(sc, bio->bi_sector + bio_sectors(bio), + stripe_map_range_sector(sc, bio_end_sector(bio), target_stripe, &end); if (begin < end) { bio->bi_bdev = sc->stripe[target_stripe].dev->bdev; diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index e50dad0c65f4..1ff252ab7d46 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -1442,7 +1442,7 @@ static bool dm_table_supports_write_same(struct dm_table *t) return false; if (!ti->type->iterate_devices || - !ti->type->iterate_devices(ti, device_not_write_same_capable, NULL)) + ti->type->iterate_devices(ti, device_not_write_same_capable, NULL)) return false; } diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index 00cee02f8fc9..60bce435f4fa 100644 --- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -1645,12 +1645,12 @@ int dm_thin_get_highest_mapped_block(struct dm_thin_device *td, return r; } -static int __resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_count) +static int __resize_space_map(struct dm_space_map *sm, dm_block_t new_count) { int r; dm_block_t old_count; - r = dm_sm_get_nr_blocks(pmd->data_sm, &old_count); + r = dm_sm_get_nr_blocks(sm, &old_count); if (r) return r; @@ -1658,11 +1658,11 @@ static int __resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_count) return 0; if (new_count < old_count) { - DMERR("cannot reduce size of data device"); + DMERR("cannot reduce size of space map"); return -EINVAL; } - return dm_sm_extend(pmd->data_sm, new_count - old_count); + return dm_sm_extend(sm, new_count - old_count); } int dm_pool_resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_count) @@ -1671,7 +1671,19 @@ int dm_pool_resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_count) down_write(&pmd->root_lock); if (!pmd->fail_io) - r = __resize_data_dev(pmd, new_count); + r = __resize_space_map(pmd->data_sm, new_count); + up_write(&pmd->root_lock); + + return r; +} + +int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_count) +{ + int r = -EINVAL; + + down_write(&pmd->root_lock); + if (!pmd->fail_io) + r = __resize_space_map(pmd->metadata_sm, new_count); up_write(&pmd->root_lock); return r; @@ -1684,3 +1696,17 @@ void dm_pool_metadata_read_only(struct dm_pool_metadata *pmd) dm_bm_set_read_only(pmd->bm); up_write(&pmd->root_lock); } + +int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd, + dm_block_t threshold, + dm_sm_threshold_fn fn, + void *context) +{ + int r; + + down_write(&pmd->root_lock); + r = dm_sm_register_threshold_callback(pmd->metadata_sm, threshold, fn, context); + up_write(&pmd->root_lock); + + return r; +} diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h index 0cecc3702885..845ebbe589a9 100644 --- a/drivers/md/dm-thin-metadata.h +++ b/drivers/md/dm-thin-metadata.h @@ -8,6 +8,7 @@ #define DM_THIN_METADATA_H #include "persistent-data/dm-block-manager.h" +#include "persistent-data/dm-space-map.h" #define THIN_METADATA_BLOCK_SIZE 4096 @@ -185,6 +186,7 @@ int dm_pool_get_data_dev_size(struct dm_pool_metadata *pmd, dm_block_t *result); * blocks would be lost. */ int dm_pool_resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_size); +int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_size); /* * Flicks the underlying block manager into read only mode, so you know @@ -192,6 +194,11 @@ int dm_pool_resize_data_dev(struct dm_pool_metadata *pmd, dm_block_t new_size); */ void dm_pool_metadata_read_only(struct dm_pool_metadata *pmd); +int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd, + dm_block_t threshold, + dm_sm_threshold_fn fn, + void *context); + /*----------------------------------------------------------------*/ #endif diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 004ad1652b73..759cffc45cab 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -922,7 +922,7 @@ static int alloc_data_block(struct thin_c *tc, dm_block_t *result) return r; if (free_blocks <= pool->low_water_blocks && !pool->low_water_triggered) { - DMWARN("%s: reached low water mark, sending event.", + DMWARN("%s: reached low water mark for data device: sending event.", dm_device_name(pool->pool_md)); spin_lock_irqsave(&pool->lock, flags); pool->low_water_triggered = 1; @@ -1281,6 +1281,10 @@ static void process_bio_fail(struct thin_c *tc, struct bio *bio) bio_io_error(bio); } +/* + * FIXME: should we also commit due to size of transaction, measured in + * metadata blocks? + */ static int need_commit_due_to_time(struct pool *pool) { return jiffies < pool->last_commit_jiffies || @@ -1909,6 +1913,56 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf, return r; } +static void metadata_low_callback(void *context) +{ + struct pool *pool = context; + + DMWARN("%s: reached low water mark for metadata device: sending event.", + dm_device_name(pool->pool_md)); + + dm_table_event(pool->ti->table); +} + +static sector_t get_metadata_dev_size(struct block_device *bdev) +{ + sector_t metadata_dev_size = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT; + char buffer[BDEVNAME_SIZE]; + + if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING) { + DMWARN("Metadata device %s is larger than %u sectors: excess space will not be used.", + bdevname(bdev, buffer), THIN_METADATA_MAX_SECTORS); + metadata_dev_size = THIN_METADATA_MAX_SECTORS_WARNING; + } + + return metadata_dev_size; +} + +static dm_block_t get_metadata_dev_size_in_blocks(struct block_device *bdev) +{ + sector_t metadata_dev_size = get_metadata_dev_size(bdev); + + sector_div(metadata_dev_size, THIN_METADATA_BLOCK_SIZE >> SECTOR_SHIFT); + + return metadata_dev_size; +} + +/* + * When a metadata threshold is crossed a dm event is triggered, and + * userland should respond by growing the metadata device. We could let + * userland set the threshold, like we do with the data threshold, but I'm + * not sure they know enough to do this well. + */ +static dm_block_t calc_metadata_threshold(struct pool_c *pt) +{ + /* + * 4M is ample for all ops with the possible exception of thin + * device deletion which is harmless if it fails (just retry the + * delete after you've grown the device). + */ + dm_block_t quarter = get_metadata_dev_size_in_blocks(pt->metadata_dev->bdev) / 4; + return min((dm_block_t)1024ULL /* 4M */, quarter); +} + /* * thin-pool * @@ -1931,8 +1985,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) unsigned long block_size; dm_block_t low_water_blocks; struct dm_dev *metadata_dev; - sector_t metadata_dev_size; - char b[BDEVNAME_SIZE]; + fmode_t metadata_mode; /* * FIXME Remove validation from scope of lock. @@ -1944,19 +1997,32 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) r = -EINVAL; goto out_unlock; } + as.argc = argc; as.argv = argv; - r = dm_get_device(ti, argv[0], FMODE_READ | FMODE_WRITE, &metadata_dev); + /* + * Set default pool features. + */ + pool_features_init(&pf); + + dm_consume_args(&as, 4); + r = parse_pool_features(&as, &pf, ti); + if (r) + goto out_unlock; + + metadata_mode = FMODE_READ | ((pf.mode == PM_READ_ONLY) ? 0 : FMODE_WRITE); + r = dm_get_device(ti, argv[0], metadata_mode, &metadata_dev); if (r) { ti->error = "Error opening metadata block device"; goto out_unlock; } - metadata_dev_size = i_size_read(metadata_dev->bdev->bd_inode) >> SECTOR_SHIFT; - if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING) - DMWARN("Metadata device %s is larger than %u sectors: excess space will not be used.", - bdevname(metadata_dev->bdev, b), THIN_METADATA_MAX_SECTORS); + /* + * Run for the side-effect of possibly issuing a warning if the + * device is too big. + */ + (void) get_metadata_dev_size(metadata_dev->bdev); r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &data_dev); if (r) { @@ -1979,16 +2045,6 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) goto out; } - /* - * Set default pool features. - */ - pool_features_init(&pf); - - dm_consume_args(&as, 4); - r = parse_pool_features(&as, &pf, ti); - if (r) - goto out; - pt = kzalloc(sizeof(*pt), GFP_KERNEL); if (!pt) { r = -ENOMEM; @@ -2040,6 +2096,13 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) } ti->private = pt; + r = dm_pool_register_metadata_threshold(pt->pool->pmd, + calc_metadata_threshold(pt), + metadata_low_callback, + pool); + if (r) + goto out_free_pt; + pt->callbacks.congested_fn = pool_is_congested; dm_table_add_target_callbacks(ti->table, &pt->callbacks); @@ -2079,18 +2142,7 @@ static int pool_map(struct dm_target *ti, struct bio *bio) return r; } -/* - * Retrieves the number of blocks of the data device from - * the superblock and compares it to the actual device size, - * thus resizing the data device in case it has grown. - * - * This both copes with opening preallocated data devices in the ctr - * being followed by a resume - * -and- - * calling the resume method individually after userspace has - * grown the data device in reaction to a table event. - */ -static int pool_preresume(struct dm_target *ti) +static int maybe_resize_data_dev(struct dm_target *ti, bool *need_commit) { int r; struct pool_c *pt = ti->private; @@ -2098,12 +2150,7 @@ static int pool_preresume(struct dm_target *ti) sector_t data_size = ti->len; dm_block_t sb_data_size; - /* - * Take control of the pool object. - */ - r = bind_control_target(pool, ti); - if (r) - return r; + *need_commit = false; (void) sector_div(data_size, pool->sectors_per_block); @@ -2114,7 +2161,7 @@ static int pool_preresume(struct dm_target *ti) } if (data_size < sb_data_size) { - DMERR("pool target too small, is %llu blocks (expected %llu)", + DMERR("pool target (%llu blocks) too small: expected %llu", (unsigned long long)data_size, sb_data_size); return -EINVAL; @@ -2122,17 +2169,90 @@ static int pool_preresume(struct dm_target *ti) r = dm_pool_resize_data_dev(pool->pmd, data_size); if (r) { DMERR("failed to resize data device"); - /* FIXME Stricter than necessary: Rollback transaction instead here */ set_pool_mode(pool, PM_READ_ONLY); return r; } - (void) commit_or_fallback(pool); + *need_commit = true; } return 0; } +static int maybe_resize_metadata_dev(struct dm_target *ti, bool *need_commit) +{ + int r; + struct pool_c *pt = ti->private; + struct pool *pool = pt->pool; + dm_block_t metadata_dev_size, sb_metadata_dev_size; + + *need_commit = false; + + metadata_dev_size = get_metadata_dev_size(pool->md_dev); + + r = dm_pool_get_metadata_dev_size(pool->pmd, &sb_metadata_dev_size); + if (r) { + DMERR("failed to retrieve data device size"); + return r; + } + + if (metadata_dev_size < sb_metadata_dev_size) { + DMERR("metadata device (%llu sectors) too small: expected %llu", + metadata_dev_size, sb_metadata_dev_size); + return -EINVAL; + + } else if (metadata_dev_size > sb_metadata_dev_size) { + r = dm_pool_resize_metadata_dev(pool->pmd, metadata_dev_size); + if (r) { + DMERR("failed to resize metadata device"); + return r; + } + + *need_commit = true; + } + + return 0; +} + +/* + * Retrieves the number of blocks of the data device from + * the superblock and compares it to the actual device size, + * thus resizing the data device in case it has grown. + * + * This both copes with opening preallocated data devices in the ctr + * being followed by a resume + * -and- + * calling the resume method individually after userspace has + * grown the data device in reaction to a table event. + */ +static int pool_preresume(struct dm_target *ti) +{ + int r; + bool need_commit1, need_commit2; + struct pool_c *pt = ti->private; + struct pool *pool = pt->pool; + + /* + * Take control of the pool object. + */ + r = bind_control_target(pool, ti); + if (r) + return r; + + r = maybe_resize_data_dev(ti, &need_commit1); + if (r) + return r; + + r = maybe_resize_metadata_dev(ti, &need_commit2); + if (r) + return r; + + if (need_commit1 || need_commit2) + (void) commit_or_fallback(pool); + + return 0; +} + static void pool_resume(struct dm_target *ti) { struct pool_c *pt = ti->private; @@ -2549,7 +2669,7 @@ static struct target_type pool_target = { .name = "thin-pool", .features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE | DM_TARGET_IMMUTABLE, - .version = {1, 7, 0}, + .version = {1, 8, 0}, .module = THIS_MODULE, .ctr = pool_ctr, .dtr = pool_dtr, diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c index a746f1d21c66..b948fd864d45 100644 --- a/drivers/md/dm-verity.c +++ b/drivers/md/dm-verity.c @@ -501,7 +501,7 @@ static int verity_map(struct dm_target *ti, struct bio *bio) return -EIO; } - if ((bio->bi_sector + bio_sectors(bio)) >> + if (bio_end_sector(bio) >> (v->data_dev_block_bits - SECTOR_SHIFT) > v->data_blocks) { DMERR_LIMIT("io out of range"); return -EIO; @@ -519,7 +519,7 @@ static int verity_map(struct dm_target *ti, struct bio *bio) bio->bi_end_io = verity_end_io; bio->bi_private = io; - io->io_vec_size = bio->bi_vcnt - bio->bi_idx; + io->io_vec_size = bio_segments(bio); if (io->io_vec_size < DM_VERITY_IO_VEC_INLINE) io->io_vec = io->io_vec_inline; else diff --git a/drivers/md/faulty.c b/drivers/md/faulty.c index 5e7dc772f5de..3193aefe982b 100644 --- a/drivers/md/faulty.c +++ b/drivers/md/faulty.c @@ -185,8 +185,7 @@ static void make_request(struct mddev *mddev, struct bio *bio) return; } - if (check_sector(conf, bio->bi_sector, bio->bi_sector+(bio->bi_size>>9), - WRITE)) + if (check_sector(conf, bio->bi_sector, bio_end_sector(bio), WRITE)) failit = 1; if (check_mode(conf, WritePersistent)) { add_sector(conf, bio->bi_sector, WritePersistent); @@ -196,8 +195,7 @@ static void make_request(struct mddev *mddev, struct bio *bio) failit = 1; } else { /* read request */ - if (check_sector(conf, bio->bi_sector, bio->bi_sector + (bio->bi_size>>9), - READ)) + if (check_sector(conf, bio->bi_sector, bio_end_sector(bio), READ)) failit = 1; if (check_mode(conf, ReadTransient)) failit = 1; diff --git a/drivers/md/linear.c b/drivers/md/linear.c index 21014836bdbf..f03fabd2b37b 100644 --- a/drivers/md/linear.c +++ b/drivers/md/linear.c @@ -317,8 +317,7 @@ static void linear_make_request(struct mddev *mddev, struct bio *bio) bio_io_error(bio); return; } - if (unlikely(bio->bi_sector + (bio->bi_size >> 9) > - tmp_dev->end_sector)) { + if (unlikely(bio_end_sector(bio) > tmp_dev->end_sector)) { /* This bio crosses a device boundary, so we have to * split it. */ diff --git a/drivers/md/md.c b/drivers/md/md.c index 6330c727396c..681d1099a2d5 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -197,21 +197,12 @@ void md_trim_bio(struct bio *bio, int offset, int size) if (offset == 0 && size == bio->bi_size) return; - bio->bi_sector += offset; - bio->bi_size = size; - offset <<= 9; clear_bit(BIO_SEG_VALID, &bio->bi_flags); - while (bio->bi_idx < bio->bi_vcnt && - bio->bi_io_vec[bio->bi_idx].bv_len <= offset) { - /* remove this whole bio_vec */ - offset -= bio->bi_io_vec[bio->bi_idx].bv_len; - bio->bi_idx++; - } - if (bio->bi_idx < bio->bi_vcnt) { - bio->bi_io_vec[bio->bi_idx].bv_offset += offset; - bio->bi_io_vec[bio->bi_idx].bv_len -= offset; - } + bio_advance(bio, offset << 9); + + bio->bi_size = size; + /* avoid any complications with bi_idx being non-zero*/ if (bio->bi_idx) { memmove(bio->bi_io_vec, bio->bi_io_vec+bio->bi_idx, diff --git a/drivers/md/persistent-data/dm-space-map-disk.c b/drivers/md/persistent-data/dm-space-map-disk.c index f6d29e614ab7..e735a6d5a793 100644 --- a/drivers/md/persistent-data/dm-space-map-disk.c +++ b/drivers/md/persistent-data/dm-space-map-disk.c @@ -248,7 +248,8 @@ static struct dm_space_map ops = { .new_block = sm_disk_new_block, .commit = sm_disk_commit, .root_size = sm_disk_root_size, - .copy_root = sm_disk_copy_root + .copy_root = sm_disk_copy_root, + .register_threshold_callback = NULL }; struct dm_space_map *dm_sm_disk_create(struct dm_transaction_manager *tm, diff --git a/drivers/md/persistent-data/dm-space-map-metadata.c b/drivers/md/persistent-data/dm-space-map-metadata.c index 906cf3df71af..1c959684caef 100644 --- a/drivers/md/persistent-data/dm-space-map-metadata.c +++ b/drivers/md/persistent-data/dm-space-map-metadata.c @@ -16,6 +16,55 @@ /*----------------------------------------------------------------*/ +/* + * An edge triggered threshold. + */ +struct threshold { + bool threshold_set; + bool value_set; + dm_block_t threshold; + dm_block_t current_value; + dm_sm_threshold_fn fn; + void *context; +}; + +static void threshold_init(struct threshold *t) +{ + t->threshold_set = false; + t->value_set = false; +} + +static void set_threshold(struct threshold *t, dm_block_t value, + dm_sm_threshold_fn fn, void *context) +{ + t->threshold_set = true; + t->threshold = value; + t->fn = fn; + t->context = context; +} + +static bool below_threshold(struct threshold *t, dm_block_t value) +{ + return t->threshold_set && value <= t->threshold; +} + +static bool threshold_already_triggered(struct threshold *t) +{ + return t->value_set && below_threshold(t, t->current_value); +} + +static void check_threshold(struct threshold *t, dm_block_t value) +{ + if (below_threshold(t, value) && + !threshold_already_triggered(t)) + t->fn(t->context); + + t->value_set = true; + t->current_value = value; +} + +/*----------------------------------------------------------------*/ + /* * Space map interface. * @@ -54,6 +103,8 @@ struct sm_metadata { unsigned allocated_this_transaction; unsigned nr_uncommitted; struct block_op uncommitted[MAX_RECURSIVE_ALLOCATIONS]; + + struct threshold threshold; }; static int add_bop(struct sm_metadata *smm, enum block_op_type type, dm_block_t b) @@ -144,12 +195,6 @@ static void sm_metadata_destroy(struct dm_space_map *sm) kfree(smm); } -static int sm_metadata_extend(struct dm_space_map *sm, dm_block_t extra_blocks) -{ - DMERR("doesn't support extend"); - return -EINVAL; -} - static int sm_metadata_get_nr_blocks(struct dm_space_map *sm, dm_block_t *count) { struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm); @@ -335,9 +380,19 @@ static int sm_metadata_new_block_(struct dm_space_map *sm, dm_block_t *b) static int sm_metadata_new_block(struct dm_space_map *sm, dm_block_t *b) { + dm_block_t count; + struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm); + int r = sm_metadata_new_block_(sm, b); if (r) DMERR("unable to allocate new metadata block"); + + r = sm_metadata_get_nr_free(sm, &count); + if (r) + DMERR("couldn't get free block count"); + + check_threshold(&smm->threshold, count); + return r; } @@ -357,6 +412,18 @@ static int sm_metadata_commit(struct dm_space_map *sm) return 0; } +static int sm_metadata_register_threshold_callback(struct dm_space_map *sm, + dm_block_t threshold, + dm_sm_threshold_fn fn, + void *context) +{ + struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm); + + set_threshold(&smm->threshold, threshold, fn, context); + + return 0; +} + static int sm_metadata_root_size(struct dm_space_map *sm, size_t *result) { *result = sizeof(struct disk_sm_root); @@ -382,6 +449,8 @@ static int sm_metadata_copy_root(struct dm_space_map *sm, void *where_le, size_t return 0; } +static int sm_metadata_extend(struct dm_space_map *sm, dm_block_t extra_blocks); + static struct dm_space_map ops = { .destroy = sm_metadata_destroy, .extend = sm_metadata_extend, @@ -395,7 +464,8 @@ static struct dm_space_map ops = { .new_block = sm_metadata_new_block, .commit = sm_metadata_commit, .root_size = sm_metadata_root_size, - .copy_root = sm_metadata_copy_root + .copy_root = sm_metadata_copy_root, + .register_threshold_callback = sm_metadata_register_threshold_callback }; /*----------------------------------------------------------------*/ @@ -410,7 +480,7 @@ static void sm_bootstrap_destroy(struct dm_space_map *sm) static int sm_bootstrap_extend(struct dm_space_map *sm, dm_block_t extra_blocks) { - DMERR("boostrap doesn't support extend"); + DMERR("bootstrap doesn't support extend"); return -EINVAL; } @@ -450,7 +520,7 @@ static int sm_bootstrap_count_is_more_than_one(struct dm_space_map *sm, static int sm_bootstrap_set_count(struct dm_space_map *sm, dm_block_t b, uint32_t count) { - DMERR("boostrap doesn't support set_count"); + DMERR("bootstrap doesn't support set_count"); return -EINVAL; } @@ -491,7 +561,7 @@ static int sm_bootstrap_commit(struct dm_space_map *sm) static int sm_bootstrap_root_size(struct dm_space_map *sm, size_t *result) { - DMERR("boostrap doesn't support root_size"); + DMERR("bootstrap doesn't support root_size"); return -EINVAL; } @@ -499,7 +569,7 @@ static int sm_bootstrap_root_size(struct dm_space_map *sm, size_t *result) static int sm_bootstrap_copy_root(struct dm_space_map *sm, void *where, size_t max) { - DMERR("boostrap doesn't support copy_root"); + DMERR("bootstrap doesn't support copy_root"); return -EINVAL; } @@ -517,11 +587,42 @@ static struct dm_space_map bootstrap_ops = { .new_block = sm_bootstrap_new_block, .commit = sm_bootstrap_commit, .root_size = sm_bootstrap_root_size, - .copy_root = sm_bootstrap_copy_root + .copy_root = sm_bootstrap_copy_root, + .register_threshold_callback = NULL }; /*----------------------------------------------------------------*/ +static int sm_metadata_extend(struct dm_space_map *sm, dm_block_t extra_blocks) +{ + int r, i; + enum allocation_event ev; + struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm); + dm_block_t old_len = smm->ll.nr_blocks; + + /* + * Flick into a mode where all blocks get allocated in the new area. + */ + smm->begin = old_len; + memcpy(&smm->sm, &bootstrap_ops, sizeof(smm->sm)); + + /* + * Extend. + */ + r = sm_ll_extend(&smm->ll, extra_blocks); + + /* + * Switch back to normal behaviour. + */ + memcpy(&smm->sm, &ops, sizeof(smm->sm)); + for (i = old_len; !r && i < smm->begin; i++) + r = sm_ll_inc(&smm->ll, i, &ev); + + return r; +} + +/*----------------------------------------------------------------*/ + struct dm_space_map *dm_sm_metadata_init(void) { struct sm_metadata *smm; @@ -549,6 +650,7 @@ int dm_sm_metadata_create(struct dm_space_map *sm, smm->recursion_count = 0; smm->allocated_this_transaction = 0; smm->nr_uncommitted = 0; + threshold_init(&smm->threshold); memcpy(&smm->sm, &bootstrap_ops, sizeof(smm->sm)); @@ -590,6 +692,7 @@ int dm_sm_metadata_open(struct dm_space_map *sm, smm->recursion_count = 0; smm->allocated_this_transaction = 0; smm->nr_uncommitted = 0; + threshold_init(&smm->threshold); memcpy(&smm->old_ll, &smm->ll, sizeof(smm->old_ll)); return 0; diff --git a/drivers/md/persistent-data/dm-space-map.h b/drivers/md/persistent-data/dm-space-map.h index 1cbfc6b1638a..3e6d1153b7c4 100644 --- a/drivers/md/persistent-data/dm-space-map.h +++ b/drivers/md/persistent-data/dm-space-map.h @@ -9,6 +9,8 @@ #include "dm-block-manager.h" +typedef void (*dm_sm_threshold_fn)(void *context); + /* * struct dm_space_map keeps a record of how many times each block in a device * is referenced. It needs to be fixed on disk as part of the transaction. @@ -59,6 +61,15 @@ struct dm_space_map { */ int (*root_size)(struct dm_space_map *sm, size_t *result); int (*copy_root)(struct dm_space_map *sm, void *copy_to_here_le, size_t len); + + /* + * You can register one threshold callback which is edge-triggered + * when the free space in the space map drops below the threshold. + */ + int (*register_threshold_callback)(struct dm_space_map *sm, + dm_block_t threshold, + dm_sm_threshold_fn fn, + void *context); }; /*----------------------------------------------------------------*/ @@ -131,4 +142,16 @@ static inline int dm_sm_copy_root(struct dm_space_map *sm, void *copy_to_here_le return sm->copy_root(sm, copy_to_here_le, len); } +static inline int dm_sm_register_threshold_callback(struct dm_space_map *sm, + dm_block_t threshold, + dm_sm_threshold_fn fn, + void *context) +{ + if (sm->register_threshold_callback) + return sm->register_threshold_callback(sm, threshold, fn, context); + + return -EINVAL; +} + + #endif /* _LINUX_DM_SPACE_MAP_H */ diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 0505452de8d6..fcf65e512cf5 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -502,11 +502,11 @@ static inline int is_io_in_chunk_boundary(struct mddev *mddev, { if (likely(is_power_of_2(chunk_sects))) { return chunk_sects >= ((bio->bi_sector & (chunk_sects-1)) - + (bio->bi_size >> 9)); + + bio_sectors(bio)); } else{ sector_t sector = bio->bi_sector; return chunk_sects >= (sector_div(sector, chunk_sects) - + (bio->bi_size >> 9)); + + bio_sectors(bio)); } } @@ -527,8 +527,7 @@ static void raid0_make_request(struct mddev *mddev, struct bio *bio) sector_t sector = bio->bi_sector; struct bio_pair *bp; /* Sanity check -- queue functions should prevent this happening */ - if ((bio->bi_vcnt != 1 && bio->bi_vcnt != 0) || - bio->bi_idx != 0) + if (bio_segments(bio) > 1) goto bad_map; /* This is a one page bio that upper layers * refuse to split for us, so we need to split it. @@ -567,7 +566,7 @@ bad_map: printk("md/raid0:%s: make_request bug: can't convert block across chunks" " or bigger than %dk %llu %d\n", mdname(mddev), chunk_sects / 2, - (unsigned long long)bio->bi_sector, bio->bi_size >> 10); + (unsigned long long)bio->bi_sector, bio_sectors(bio) / 2); bio_io_error(bio); return; diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 851023e2ba5d..55951182af73 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -92,7 +92,6 @@ static void r1bio_pool_free(void *r1_bio, void *data) static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data) { struct pool_info *pi = data; - struct page *page; struct r1bio *r1_bio; struct bio *bio; int i, j; @@ -122,14 +121,10 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data) j = 1; while(j--) { bio = r1_bio->bios[j]; - for (i = 0; i < RESYNC_PAGES; i++) { - page = alloc_page(gfp_flags); - if (unlikely(!page)) - goto out_free_pages; + bio->bi_vcnt = RESYNC_PAGES; - bio->bi_io_vec[i].bv_page = page; - bio->bi_vcnt = i+1; - } + if (bio_alloc_pages(bio, gfp_flags)) + goto out_free_bio; } /* If not user-requests, copy the page pointers to all bios */ if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) { @@ -143,11 +138,6 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data) return r1_bio; -out_free_pages: - for (j=0 ; j < pi->raid_disks; j++) - for (i=0; i < r1_bio->bios[j]->bi_vcnt ; i++) - put_page(r1_bio->bios[j]->bi_io_vec[i].bv_page); - j = -1; out_free_bio: while (++j < pi->raid_disks) bio_put(r1_bio->bios[j]); @@ -267,7 +257,7 @@ static void raid_end_bio_io(struct r1bio *r1_bio) (bio_data_dir(bio) == WRITE) ? "write" : "read", (unsigned long long) bio->bi_sector, (unsigned long long) bio->bi_sector + - (bio->bi_size >> 9) - 1); + bio_sectors(bio) - 1); call_bio_endio(r1_bio); } @@ -458,7 +448,7 @@ static void raid1_end_write_request(struct bio *bio, int error) " %llu-%llu\n", (unsigned long long) mbio->bi_sector, (unsigned long long) mbio->bi_sector + - (mbio->bi_size >> 9) - 1); + bio_sectors(mbio) - 1); call_bio_endio(r1_bio); } } @@ -925,7 +915,7 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio) if (unlikely(!bvecs)) return; - bio_for_each_segment(bvec, bio, i) { + bio_for_each_segment_all(bvec, bio, i) { bvecs[i] = *bvec; bvecs[i].bv_page = alloc_page(GFP_NOIO); if (unlikely(!bvecs[i].bv_page)) @@ -1023,7 +1013,7 @@ static void make_request(struct mddev *mddev, struct bio * bio) md_write_start(mddev, bio); /* wait on superblock update early */ if (bio_data_dir(bio) == WRITE && - bio->bi_sector + bio->bi_size/512 > mddev->suspend_lo && + bio_end_sector(bio) > mddev->suspend_lo && bio->bi_sector < mddev->suspend_hi) { /* As the suspend_* range is controlled by * userspace, we want an interruptible @@ -1034,7 +1024,7 @@ static void make_request(struct mddev *mddev, struct bio * bio) flush_signals(current); prepare_to_wait(&conf->wait_barrier, &w, TASK_INTERRUPTIBLE); - if (bio->bi_sector + bio->bi_size/512 <= mddev->suspend_lo || + if (bio_end_sector(bio) <= mddev->suspend_lo || bio->bi_sector >= mddev->suspend_hi) break; schedule(); @@ -1054,7 +1044,7 @@ static void make_request(struct mddev *mddev, struct bio * bio) r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO); r1_bio->master_bio = bio; - r1_bio->sectors = bio->bi_size >> 9; + r1_bio->sectors = bio_sectors(bio); r1_bio->state = 0; r1_bio->mddev = mddev; r1_bio->sector = bio->bi_sector; @@ -1132,7 +1122,7 @@ read_again: r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO); r1_bio->master_bio = bio; - r1_bio->sectors = (bio->bi_size >> 9) - sectors_handled; + r1_bio->sectors = bio_sectors(bio) - sectors_handled; r1_bio->state = 0; r1_bio->mddev = mddev; r1_bio->sector = bio->bi_sector + sectors_handled; @@ -1289,14 +1279,10 @@ read_again: struct bio_vec *bvec; int j; - /* Yes, I really want the '__' version so that - * we clear any unused pointer in the io_vec, rather - * than leave them unchanged. This is important - * because when we come to free the pages, we won't - * know the original bi_idx, so we just free - * them all + /* + * We trimmed the bio, so _all is legit */ - __bio_for_each_segment(bvec, mbio, j, 0) + bio_for_each_segment_all(bvec, mbio, j) bvec->bv_page = r1_bio->behind_bvecs[j].bv_page; if (test_bit(WriteMostly, &conf->mirrors[i].rdev->flags)) atomic_inc(&r1_bio->behind_remaining); @@ -1334,14 +1320,14 @@ read_again: /* Mustn't call r1_bio_write_done before this next test, * as it could result in the bio being freed. */ - if (sectors_handled < (bio->bi_size >> 9)) { + if (sectors_handled < bio_sectors(bio)) { r1_bio_write_done(r1_bio); /* We need another r1_bio. It has already been counted * in bio->bi_phys_segments */ r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO); r1_bio->master_bio = bio; - r1_bio->sectors = (bio->bi_size >> 9) - sectors_handled; + r1_bio->sectors = bio_sectors(bio) - sectors_handled; r1_bio->state = 0; r1_bio->mddev = mddev; r1_bio->sector = bio->bi_sector + sectors_handled; @@ -1867,7 +1853,7 @@ static int process_checks(struct r1bio *r1_bio) struct bio *sbio = r1_bio->bios[i]; int size; - if (r1_bio->bios[i]->bi_end_io != end_sync_read) + if (sbio->bi_end_io != end_sync_read) continue; if (test_bit(BIO_UPTODATE, &sbio->bi_flags)) { @@ -1892,16 +1878,15 @@ static int process_checks(struct r1bio *r1_bio) continue; } /* fixup the bio for reuse */ + bio_reset(sbio); sbio->bi_vcnt = vcnt; sbio->bi_size = r1_bio->sectors << 9; - sbio->bi_idx = 0; - sbio->bi_phys_segments = 0; - sbio->bi_flags &= ~(BIO_POOL_MASK - 1); - sbio->bi_flags |= 1 << BIO_UPTODATE; - sbio->bi_next = NULL; sbio->bi_sector = r1_bio->sector + conf->mirrors[i].rdev->data_offset; sbio->bi_bdev = conf->mirrors[i].rdev->bdev; + sbio->bi_end_io = end_sync_read; + sbio->bi_private = r1_bio; + size = sbio->bi_size; for (j = 0; j < vcnt ; j++) { struct bio_vec *bi; @@ -1912,10 +1897,9 @@ static int process_checks(struct r1bio *r1_bio) else bi->bv_len = size; size -= PAGE_SIZE; - memcpy(page_address(bi->bv_page), - page_address(pbio->bi_io_vec[j].bv_page), - PAGE_SIZE); } + + bio_copy_data(sbio, pbio); } return 0; } @@ -1952,7 +1936,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio) wbio->bi_rw = WRITE; wbio->bi_end_io = end_sync_write; atomic_inc(&r1_bio->remaining); - md_sync_acct(conf->mirrors[i].rdev->bdev, wbio->bi_size >> 9); + md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio)); generic_make_request(wbio); } @@ -2064,32 +2048,11 @@ static void fix_read_error(struct r1conf *conf, int read_disk, } } -static void bi_complete(struct bio *bio, int error) -{ - complete((struct completion *)bio->bi_private); -} - -static int submit_bio_wait(int rw, struct bio *bio) -{ - struct completion event; - rw |= REQ_SYNC; - - init_completion(&event); - bio->bi_private = &event; - bio->bi_end_io = bi_complete; - submit_bio(rw, bio); - wait_for_completion(&event); - - return test_bit(BIO_UPTODATE, &bio->bi_flags); -} - static int narrow_write_error(struct r1bio *r1_bio, int i) { struct mddev *mddev = r1_bio->mddev; struct r1conf *conf = mddev->private; struct md_rdev *rdev = conf->mirrors[i].rdev; - int vcnt, idx; - struct bio_vec *vec; /* bio has the data to be written to device 'i' where * we just recently had a write error. @@ -2117,30 +2080,32 @@ static int narrow_write_error(struct r1bio *r1_bio, int i) & ~(sector_t)(block_sectors - 1)) - sector; - if (test_bit(R1BIO_BehindIO, &r1_bio->state)) { - vcnt = r1_bio->behind_page_count; - vec = r1_bio->behind_bvecs; - idx = 0; - while (vec[idx].bv_page == NULL) - idx++; - } else { - vcnt = r1_bio->master_bio->bi_vcnt; - vec = r1_bio->master_bio->bi_io_vec; - idx = r1_bio->master_bio->bi_idx; - } while (sect_to_write) { struct bio *wbio; if (sectors > sect_to_write) sectors = sect_to_write; /* Write at 'sector' for 'sectors'*/ - wbio = bio_alloc_mddev(GFP_NOIO, vcnt, mddev); - memcpy(wbio->bi_io_vec, vec, vcnt * sizeof(struct bio_vec)); - wbio->bi_sector = r1_bio->sector; + if (test_bit(R1BIO_BehindIO, &r1_bio->state)) { + unsigned vcnt = r1_bio->behind_page_count; + struct bio_vec *vec = r1_bio->behind_bvecs; + + while (!vec->bv_page) { + vec++; + vcnt--; + } + + wbio = bio_alloc_mddev(GFP_NOIO, vcnt, mddev); + memcpy(wbio->bi_io_vec, vec, vcnt * sizeof(struct bio_vec)); + + wbio->bi_vcnt = vcnt; + } else { + wbio = bio_clone_mddev(r1_bio->master_bio, GFP_NOIO, mddev); + } + wbio->bi_rw = WRITE; - wbio->bi_vcnt = vcnt; + wbio->bi_sector = r1_bio->sector; wbio->bi_size = r1_bio->sectors << 9; - wbio->bi_idx = idx; md_trim_bio(wbio, sector - r1_bio->sector, sectors); wbio->bi_sector += rdev->data_offset; @@ -2289,8 +2254,7 @@ read_more: r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO); r1_bio->master_bio = mbio; - r1_bio->sectors = (mbio->bi_size >> 9) - - sectors_handled; + r1_bio->sectors = bio_sectors(mbio) - sectors_handled; r1_bio->state = 0; set_bit(R1BIO_ReadError, &r1_bio->state); r1_bio->mddev = mddev; @@ -2464,18 +2428,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp for (i = 0; i < conf->raid_disks * 2; i++) { struct md_rdev *rdev; bio = r1_bio->bios[i]; - - /* take from bio_init */ - bio->bi_next = NULL; - bio->bi_flags &= ~(BIO_POOL_MASK-1); - bio->bi_flags |= 1 << BIO_UPTODATE; - bio->bi_rw = READ; - bio->bi_vcnt = 0; - bio->bi_idx = 0; - bio->bi_phys_segments = 0; - bio->bi_size = 0; - bio->bi_end_io = NULL; - bio->bi_private = NULL; + bio_reset(bio); rdev = rcu_dereference(conf->mirrors[i].rdev); if (rdev == NULL || diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 018741ba9310..59d4daa5f4c7 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1174,14 +1174,13 @@ static void make_request(struct mddev *mddev, struct bio * bio) /* If this request crosses a chunk boundary, we need to * split it. This will only happen for 1 PAGE (or less) requests. */ - if (unlikely((bio->bi_sector & chunk_mask) + (bio->bi_size >> 9) + if (unlikely((bio->bi_sector & chunk_mask) + bio_sectors(bio) > chunk_sects && (conf->geo.near_copies < conf->geo.raid_disks || conf->prev.near_copies < conf->prev.raid_disks))) { struct bio_pair *bp; /* Sanity check -- queue functions should prevent this happening */ - if ((bio->bi_vcnt != 1 && bio->bi_vcnt != 0) || - bio->bi_idx != 0) + if (bio_segments(bio) > 1) goto bad_map; /* This is a one page bio that upper layers * refuse to split for us, so we need to split it. @@ -1214,7 +1213,7 @@ static void make_request(struct mddev *mddev, struct bio * bio) bad_map: printk("md/raid10:%s: make_request bug: can't convert block across chunks" " or bigger than %dk %llu %d\n", mdname(mddev), chunk_sects/2, - (unsigned long long)bio->bi_sector, bio->bi_size >> 10); + (unsigned long long)bio->bi_sector, bio_sectors(bio) / 2); bio_io_error(bio); return; @@ -1229,7 +1228,7 @@ static void make_request(struct mddev *mddev, struct bio * bio) */ wait_barrier(conf); - sectors = bio->bi_size >> 9; + sectors = bio_sectors(bio); while (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && bio->bi_sector < conf->reshape_progress && bio->bi_sector + sectors > conf->reshape_progress) { @@ -1331,8 +1330,7 @@ read_again: r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO); r10_bio->master_bio = bio; - r10_bio->sectors = ((bio->bi_size >> 9) - - sectors_handled); + r10_bio->sectors = bio_sectors(bio) - sectors_handled; r10_bio->state = 0; r10_bio->mddev = mddev; r10_bio->sector = bio->bi_sector + sectors_handled; @@ -1574,7 +1572,7 @@ retry_write: * after checking if we need to go around again. */ - if (sectors_handled < (bio->bi_size >> 9)) { + if (sectors_handled < bio_sectors(bio)) { one_write_done(r10_bio); /* We need another r10_bio. It has already been counted * in bio->bi_phys_segments. @@ -1582,7 +1580,7 @@ retry_write: r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO); r10_bio->master_bio = bio; - r10_bio->sectors = (bio->bi_size >> 9) - sectors_handled; + r10_bio->sectors = bio_sectors(bio) - sectors_handled; r10_bio->mddev = mddev; r10_bio->sector = bio->bi_sector + sectors_handled; @@ -2084,13 +2082,10 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio) * First we need to fixup bv_offset, bv_len and * bi_vecs, as the read request might have corrupted these */ + bio_reset(tbio); + tbio->bi_vcnt = vcnt; tbio->bi_size = r10_bio->sectors << 9; - tbio->bi_idx = 0; - tbio->bi_phys_segments = 0; - tbio->bi_flags &= ~(BIO_POOL_MASK - 1); - tbio->bi_flags |= 1 << BIO_UPTODATE; - tbio->bi_next = NULL; tbio->bi_rw = WRITE; tbio->bi_private = r10_bio; tbio->bi_sector = r10_bio->devs[i].addr; @@ -2108,7 +2103,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio) d = r10_bio->devs[i].devnum; atomic_inc(&conf->mirrors[d].rdev->nr_pending); atomic_inc(&r10_bio->remaining); - md_sync_acct(conf->mirrors[d].rdev->bdev, tbio->bi_size >> 9); + md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio)); tbio->bi_sector += conf->mirrors[d].rdev->data_offset; tbio->bi_bdev = conf->mirrors[d].rdev->bdev; @@ -2133,7 +2128,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio) d = r10_bio->devs[i].devnum; atomic_inc(&r10_bio->remaining); md_sync_acct(conf->mirrors[d].replacement->bdev, - tbio->bi_size >> 9); + bio_sectors(tbio)); generic_make_request(tbio); } @@ -2259,13 +2254,13 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio) wbio2 = r10_bio->devs[1].repl_bio; if (wbio->bi_end_io) { atomic_inc(&conf->mirrors[d].rdev->nr_pending); - md_sync_acct(conf->mirrors[d].rdev->bdev, wbio->bi_size >> 9); + md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio)); generic_make_request(wbio); } if (wbio2 && wbio2->bi_end_io) { atomic_inc(&conf->mirrors[d].replacement->nr_pending); md_sync_acct(conf->mirrors[d].replacement->bdev, - wbio2->bi_size >> 9); + bio_sectors(wbio2)); generic_make_request(wbio2); } } @@ -2536,25 +2531,6 @@ static void fix_read_error(struct r10conf *conf, struct mddev *mddev, struct r10 } } -static void bi_complete(struct bio *bio, int error) -{ - complete((struct completion *)bio->bi_private); -} - -static int submit_bio_wait(int rw, struct bio *bio) -{ - struct completion event; - rw |= REQ_SYNC; - - init_completion(&event); - bio->bi_private = &event; - bio->bi_end_io = bi_complete; - submit_bio(rw, bio); - wait_for_completion(&event); - - return test_bit(BIO_UPTODATE, &bio->bi_flags); -} - static int narrow_write_error(struct r10bio *r10_bio, int i) { struct bio *bio = r10_bio->master_bio; @@ -2695,8 +2671,7 @@ read_more: r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO); r10_bio->master_bio = mbio; - r10_bio->sectors = (mbio->bi_size >> 9) - - sectors_handled; + r10_bio->sectors = bio_sectors(mbio) - sectors_handled; r10_bio->state = 0; set_bit(R10BIO_ReadError, &r10_bio->state); @@ -3133,6 +3108,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, } } bio = r10_bio->devs[0].bio; + bio_reset(bio); bio->bi_next = biolist; biolist = bio; bio->bi_private = r10_bio; @@ -3157,6 +3133,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, rdev = mirror->rdev; if (!test_bit(In_sync, &rdev->flags)) { bio = r10_bio->devs[1].bio; + bio_reset(bio); bio->bi_next = biolist; biolist = bio; bio->bi_private = r10_bio; @@ -3185,6 +3162,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, if (rdev == NULL || bio == NULL || test_bit(Faulty, &rdev->flags)) break; + bio_reset(bio); bio->bi_next = biolist; biolist = bio; bio->bi_private = r10_bio; @@ -3283,7 +3261,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, r10_bio->devs[i].repl_bio->bi_end_io = NULL; bio = r10_bio->devs[i].bio; - bio->bi_end_io = NULL; + bio_reset(bio); clear_bit(BIO_UPTODATE, &bio->bi_flags); if (conf->mirrors[d].rdev == NULL || test_bit(Faulty, &conf->mirrors[d].rdev->flags)) @@ -3320,6 +3298,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, /* Need to set up for writing to the replacement */ bio = r10_bio->devs[i].repl_bio; + bio_reset(bio); clear_bit(BIO_UPTODATE, &bio->bi_flags); sector = r10_bio->devs[i].addr; @@ -3353,17 +3332,6 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, } } - for (bio = biolist; bio ; bio=bio->bi_next) { - - bio->bi_flags &= ~(BIO_POOL_MASK - 1); - if (bio->bi_end_io) - bio->bi_flags |= 1 << BIO_UPTODATE; - bio->bi_vcnt = 0; - bio->bi_idx = 0; - bio->bi_phys_segments = 0; - bio->bi_size = 0; - } - nr_sectors = 0; if (sector_nr + max_sync < max_sector) max_sector = sector_nr + max_sync; @@ -4411,7 +4379,6 @@ read_more: read_bio->bi_flags &= ~(BIO_POOL_MASK - 1); read_bio->bi_flags |= 1 << BIO_UPTODATE; read_bio->bi_vcnt = 0; - read_bio->bi_idx = 0; read_bio->bi_size = 0; r10_bio->master_bio = read_bio; r10_bio->read_slot = r10_bio->devs[r10_bio->read_slot].devnum; @@ -4435,17 +4402,14 @@ read_more: } if (!rdev2 || test_bit(Faulty, &rdev2->flags)) continue; + + bio_reset(b); b->bi_bdev = rdev2->bdev; b->bi_sector = r10_bio->devs[s/2].addr + rdev2->new_data_offset; b->bi_private = r10_bio; b->bi_end_io = end_reshape_write; b->bi_rw = WRITE; - b->bi_flags &= ~(BIO_POOL_MASK - 1); - b->bi_flags |= 1 << BIO_UPTODATE; b->bi_next = blist; - b->bi_vcnt = 0; - b->bi_idx = 0; - b->bi_size = 0; blist = b; } diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 4a7be455d6d8..9359828ffe26 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -90,7 +90,7 @@ static inline struct hlist_head *stripe_hash(struct r5conf *conf, sector_t sect) */ static inline struct bio *r5_next_bio(struct bio *bio, sector_t sector) { - int sectors = bio->bi_size >> 9; + int sectors = bio_sectors(bio); if (bio->bi_sector + sectors < sector + STRIPE_SECTORS) return bio->bi_next; else @@ -569,14 +569,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) bi = &sh->dev[i].req; rbi = &sh->dev[i].rreq; /* For writing to replacement */ - bi->bi_rw = rw; - rbi->bi_rw = rw; - if (rw & WRITE) { - bi->bi_end_io = raid5_end_write_request; - rbi->bi_end_io = raid5_end_write_request; - } else - bi->bi_end_io = raid5_end_read_request; - rcu_read_lock(); rrdev = rcu_dereference(conf->disks[i].replacement); smp_mb(); /* Ensure that if rrdev is NULL, rdev won't be */ @@ -651,7 +643,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) set_bit(STRIPE_IO_STARTED, &sh->state); + bio_reset(bi); bi->bi_bdev = rdev->bdev; + bi->bi_rw = rw; + bi->bi_end_io = (rw & WRITE) + ? raid5_end_write_request + : raid5_end_read_request; + bi->bi_private = sh; + pr_debug("%s: for %llu schedule op %ld on disc %d\n", __func__, (unsigned long long)sh->sector, bi->bi_rw, i); @@ -665,12 +664,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) bi->bi_rw |= REQ_FLUSH; - bi->bi_flags = 1 << BIO_UPTODATE; - bi->bi_idx = 0; bi->bi_io_vec[0].bv_len = STRIPE_SIZE; bi->bi_io_vec[0].bv_offset = 0; bi->bi_size = STRIPE_SIZE; - bi->bi_next = NULL; if (rrdev) set_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags); @@ -687,7 +683,13 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) set_bit(STRIPE_IO_STARTED, &sh->state); + bio_reset(rbi); rbi->bi_bdev = rrdev->bdev; + rbi->bi_rw = rw; + BUG_ON(!(rw & WRITE)); + rbi->bi_end_io = raid5_end_write_request; + rbi->bi_private = sh; + pr_debug("%s: for %llu schedule op %ld on " "replacement disc %d\n", __func__, (unsigned long long)sh->sector, @@ -699,12 +701,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) else rbi->bi_sector = (sh->sector + rrdev->data_offset); - rbi->bi_flags = 1 << BIO_UPTODATE; - rbi->bi_idx = 0; rbi->bi_io_vec[0].bv_len = STRIPE_SIZE; rbi->bi_io_vec[0].bv_offset = 0; rbi->bi_size = STRIPE_SIZE; - rbi->bi_next = NULL; if (conf->mddev->gendisk) trace_block_bio_remap(bdev_get_queue(rbi->bi_bdev), rbi, disk_devt(conf->mddev->gendisk), @@ -2402,11 +2401,11 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in } else bip = &sh->dev[dd_idx].toread; while (*bip && (*bip)->bi_sector < bi->bi_sector) { - if ((*bip)->bi_sector + ((*bip)->bi_size >> 9) > bi->bi_sector) + if (bio_end_sector(*bip) > bi->bi_sector) goto overlap; bip = & (*bip)->bi_next; } - if (*bip && (*bip)->bi_sector < bi->bi_sector + ((bi->bi_size)>>9)) + if (*bip && (*bip)->bi_sector < bio_end_sector(bi)) goto overlap; BUG_ON(*bip && bi->bi_next && (*bip) != bi->bi_next); @@ -2422,8 +2421,8 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in sector < sh->dev[dd_idx].sector + STRIPE_SECTORS && bi && bi->bi_sector <= sector; bi = r5_next_bio(bi, sh->dev[dd_idx].sector)) { - if (bi->bi_sector + (bi->bi_size>>9) >= sector) - sector = bi->bi_sector + (bi->bi_size>>9); + if (bio_end_sector(bi) >= sector) + sector = bio_end_sector(bi); } if (sector >= sh->dev[dd_idx].sector + STRIPE_SECTORS) set_bit(R5_OVERWRITE, &sh->dev[dd_idx].flags); @@ -3849,7 +3848,7 @@ static int in_chunk_boundary(struct mddev *mddev, struct bio *bio) { sector_t sector = bio->bi_sector + get_start_sect(bio->bi_bdev); unsigned int chunk_sectors = mddev->chunk_sectors; - unsigned int bio_sectors = bio->bi_size >> 9; + unsigned int bio_sectors = bio_sectors(bio); if (mddev->new_chunk_sectors < mddev->chunk_sectors) chunk_sectors = mddev->new_chunk_sectors; @@ -3941,7 +3940,7 @@ static int bio_fits_rdev(struct bio *bi) { struct request_queue *q = bdev_get_queue(bi->bi_bdev); - if ((bi->bi_size>>9) > queue_max_sectors(q)) + if (bio_sectors(bi) > queue_max_sectors(q)) return 0; blk_recount_segments(q, bi); if (bi->bi_phys_segments > queue_max_segments(q)) @@ -3988,7 +3987,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio) 0, &dd_idx, NULL); - end_sector = align_bi->bi_sector + (align_bi->bi_size >> 9); + end_sector = bio_end_sector(align_bi); rcu_read_lock(); rdev = rcu_dereference(conf->disks[dd_idx].replacement); if (!rdev || test_bit(Faulty, &rdev->flags) || @@ -4011,7 +4010,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio) align_bi->bi_flags &= ~(1 << BIO_SEG_VALID); if (!bio_fits_rdev(align_bi) || - is_badblock(rdev, align_bi->bi_sector, align_bi->bi_size>>9, + is_badblock(rdev, align_bi->bi_sector, bio_sectors(align_bi), &first_bad, &bad_sectors)) { /* too big in some way, or has a known bad block */ bio_put(align_bi); @@ -4273,7 +4272,7 @@ static void make_request(struct mddev *mddev, struct bio * bi) } logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1); - last_sector = bi->bi_sector + (bi->bi_size>>9); + last_sector = bio_end_sector(bi); bi->bi_next = NULL; bi->bi_phys_segments = 1; /* over-loaded to count active stripes */ @@ -4739,7 +4738,7 @@ static int retry_aligned_read(struct r5conf *conf, struct bio *raid_bio) logical_sector = raid_bio->bi_sector & ~((sector_t)STRIPE_SECTORS-1); sector = raid5_compute_sector(conf, logical_sector, 0, &dd_idx, NULL); - last_sector = raid_bio->bi_sector + (raid_bio->bi_size>>9); + last_sector = bio_end_sector(raid_bio); for (; logical_sector < last_sector; logical_sector += STRIPE_SECTORS, diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c index ffee6f781e30..dd239bdbfcb4 100644 --- a/drivers/message/fusion/mptsas.c +++ b/drivers/message/fusion/mptsas.c @@ -2235,10 +2235,10 @@ static int mptsas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, } /* do we need to support multiple segments? */ - if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) { + if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) { printk(MYIOC_s_ERR_FMT "%s: multiple segments req %u %u, rsp %u %u\n", - ioc->name, __func__, req->bio->bi_vcnt, blk_rq_bytes(req), - rsp->bio->bi_vcnt, blk_rq_bytes(rsp)); + ioc->name, __func__, bio_segments(req->bio), blk_rq_bytes(req), + bio_segments(rsp->bio), blk_rq_bytes(rsp)); return -EINVAL; } diff --git a/drivers/mtd/Kconfig b/drivers/mtd/Kconfig index 557bec599f4f..5fab4e6e8301 100644 --- a/drivers/mtd/Kconfig +++ b/drivers/mtd/Kconfig @@ -157,19 +157,6 @@ config MTD_BCM47XX_PARTS comment "User Modules And Translation Layers" -config MTD_CHAR - tristate "Direct char device access to MTD devices" - help - This provides a character device for each MTD device present in - the system, allowing the user to read and write directly to the - memory chips, and also use ioctl() to obtain information about - the device, or to erase parts of it. - -config HAVE_MTD_OTP - bool - help - Enable access to OTP regions using MTD_CHAR. - config MTD_BLKDEVS tristate "Common interface to block layer for MTD 'translation layers'" depends on BLOCK diff --git a/drivers/mtd/Makefile b/drivers/mtd/Makefile index 18a38e55b2f0..4cfb31e6c966 100644 --- a/drivers/mtd/Makefile +++ b/drivers/mtd/Makefile @@ -4,7 +4,7 @@ # Core functionality. obj-$(CONFIG_MTD) += mtd.o -mtd-y := mtdcore.o mtdsuper.o mtdconcat.o mtdpart.o +mtd-y := mtdcore.o mtdsuper.o mtdconcat.o mtdpart.o mtdchar.o obj-$(CONFIG_MTD_OF_PARTS) += ofpart.o obj-$(CONFIG_MTD_REDBOOT_PARTS) += redboot.o @@ -15,7 +15,6 @@ obj-$(CONFIG_MTD_BCM63XX_PARTS) += bcm63xxpart.o obj-$(CONFIG_MTD_BCM47XX_PARTS) += bcm47xxpart.o # 'Users' - code which presents functionality to userspace. -obj-$(CONFIG_MTD_CHAR) += mtdchar.o obj-$(CONFIG_MTD_BLKDEVS) += mtd_blkdevs.o obj-$(CONFIG_MTD_BLOCK) += mtdblock.o obj-$(CONFIG_MTD_BLOCK_RO) += mtdblock_ro.o diff --git a/drivers/mtd/chips/Kconfig b/drivers/mtd/chips/Kconfig index c219e3d098d9..e4696b37f3de 100644 --- a/drivers/mtd/chips/Kconfig +++ b/drivers/mtd/chips/Kconfig @@ -146,7 +146,6 @@ config MTD_CFI_I8 config MTD_OTP bool "Protection Registers aka one-time programmable (OTP) bits" depends on MTD_CFI_ADV_OPTIONS - select HAVE_MTD_OTP default n help This enables support for reading, writing and locking so called diff --git a/drivers/mtd/devices/Kconfig b/drivers/mtd/devices/Kconfig index 12311f506ca1..2a4d55e4b362 100644 --- a/drivers/mtd/devices/Kconfig +++ b/drivers/mtd/devices/Kconfig @@ -71,7 +71,6 @@ config MTD_DATAFLASH_WRITE_VERIFY config MTD_DATAFLASH_OTP bool "DataFlash OTP support (Security Register)" depends on MTD_DATAFLASH - select HAVE_MTD_OTP help Newer DataFlash chips (revisions C and D) support 128 bytes of one-time-programmable (OTP) data. The first half may be written @@ -205,69 +204,6 @@ config MTD_BLOCK2MTD comment "Disk-On-Chip Device Drivers" -config MTD_DOC2000 - tristate "M-Systems Disk-On-Chip 2000 and Millennium (DEPRECATED)" - depends on MTD_NAND - select MTD_DOCPROBE - select MTD_NAND_IDS - ---help--- - This provides an MTD device driver for the M-Systems DiskOnChip - 2000 and Millennium devices. Originally designed for the DiskOnChip - 2000, it also now includes support for the DiskOnChip Millennium. - If you have problems with this driver and the DiskOnChip Millennium, - you may wish to try the alternative Millennium driver below. To use - the alternative driver, you will need to undefine DOC_SINGLE_DRIVER - in the source code. - - If you use this device, you probably also want to enable the NFTL - 'NAND Flash Translation Layer' option below, which is used to - emulate a block device by using a kind of file system on the flash - chips. - - NOTE: This driver is deprecated and will probably be removed soon. - Please try the new DiskOnChip driver under "NAND Flash Device - Drivers". - -config MTD_DOC2001 - tristate "M-Systems Disk-On-Chip Millennium-only alternative driver (DEPRECATED)" - depends on MTD_NAND - select MTD_DOCPROBE - select MTD_NAND_IDS - ---help--- - This provides an alternative MTD device driver for the M-Systems - DiskOnChip Millennium devices. Use this if you have problems with - the combined DiskOnChip 2000 and Millennium driver above. To get - the DiskOnChip probe code to load and use this driver instead of - the other one, you will need to undefine DOC_SINGLE_DRIVER near - the beginning of . - - If you use this device, you probably also want to enable the NFTL - 'NAND Flash Translation Layer' option below, which is used to - emulate a block device by using a kind of file system on the flash - chips. - - NOTE: This driver is deprecated and will probably be removed soon. - Please try the new DiskOnChip driver under "NAND Flash Device - Drivers". - -config MTD_DOC2001PLUS - tristate "M-Systems Disk-On-Chip Millennium Plus" - depends on MTD_NAND - select MTD_DOCPROBE - select MTD_NAND_IDS - ---help--- - This provides an MTD device driver for the M-Systems DiskOnChip - Millennium Plus devices. - - If you use this device, you probably also want to enable the INFTL - 'Inverse NAND Flash Translation Layer' option below, which is used - to emulate a block device by using a kind of file system on the - flash chips. - - NOTE: This driver will soon be replaced by the new DiskOnChip driver - under "NAND Flash Device Drivers" (currently that driver does not - support all Millennium Plus devices). - config MTD_DOCG3 tristate "M-Systems Disk-On-Chip G3" select BCH diff --git a/drivers/mtd/devices/Makefile b/drivers/mtd/devices/Makefile index 369a1943ca25..d83bd73096f6 100644 --- a/drivers/mtd/devices/Makefile +++ b/drivers/mtd/devices/Makefile @@ -2,12 +2,7 @@ # linux/drivers/mtd/devices/Makefile # -obj-$(CONFIG_MTD_DOC2000) += doc2000.o -obj-$(CONFIG_MTD_DOC2001) += doc2001.o -obj-$(CONFIG_MTD_DOC2001PLUS) += doc2001plus.o obj-$(CONFIG_MTD_DOCG3) += docg3.o -obj-$(CONFIG_MTD_DOCPROBE) += docprobe.o -obj-$(CONFIG_MTD_DOCECC) += docecc.o obj-$(CONFIG_MTD_SLRAM) += slram.o obj-$(CONFIG_MTD_PHRAM) += phram.o obj-$(CONFIG_MTD_PMC551) += pmc551.o diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c index 95266285acb1..18e7761137a3 100644 --- a/drivers/mtd/devices/bcm47xxsflash.c +++ b/drivers/mtd/devices/bcm47xxsflash.c @@ -10,7 +10,7 @@ MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Serial flash driver for BCMA bus"); -static const char *probes[] = { "bcm47xxpart", NULL }; +static const char * const probes[] = { "bcm47xxpart", NULL }; static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf) @@ -61,6 +61,17 @@ static int bcm47xxsflash_bcma_probe(struct platform_device *pdev) } sflash->priv = b47s; + b47s->bcma_cc = container_of(sflash, struct bcma_drv_cc, sflash); + + switch (b47s->bcma_cc->capabilities & BCMA_CC_CAP_FLASHT) { + case BCMA_CC_FLASHT_STSER: + b47s->type = BCM47XXSFLASH_TYPE_ST; + break; + case BCMA_CC_FLASHT_ATSER: + b47s->type = BCM47XXSFLASH_TYPE_ATMEL; + break; + } + b47s->window = sflash->window; b47s->blocksize = sflash->blocksize; b47s->numblocks = sflash->numblocks; diff --git a/drivers/mtd/devices/bcm47xxsflash.h b/drivers/mtd/devices/bcm47xxsflash.h index ebf6f710e23c..f22f8c46dfc0 100644 --- a/drivers/mtd/devices/bcm47xxsflash.h +++ b/drivers/mtd/devices/bcm47xxsflash.h @@ -3,7 +3,66 @@ #include +/* Used for ST flashes only. */ +#define OPCODE_ST_WREN 0x0006 /* Write Enable */ +#define OPCODE_ST_WRDIS 0x0004 /* Write Disable */ +#define OPCODE_ST_RDSR 0x0105 /* Read Status Register */ +#define OPCODE_ST_WRSR 0x0101 /* Write Status Register */ +#define OPCODE_ST_READ 0x0303 /* Read Data Bytes */ +#define OPCODE_ST_PP 0x0302 /* Page Program */ +#define OPCODE_ST_SE 0x02d8 /* Sector Erase */ +#define OPCODE_ST_BE 0x00c7 /* Bulk Erase */ +#define OPCODE_ST_DP 0x00b9 /* Deep Power-down */ +#define OPCODE_ST_RES 0x03ab /* Read Electronic Signature */ +#define OPCODE_ST_CSA 0x1000 /* Keep chip select asserted */ +#define OPCODE_ST_SSE 0x0220 /* Sub-sector Erase */ + +/* Used for Atmel flashes only. */ +#define OPCODE_AT_READ 0x07e8 +#define OPCODE_AT_PAGE_READ 0x07d2 +#define OPCODE_AT_STATUS 0x01d7 +#define OPCODE_AT_BUF1_WRITE 0x0384 +#define OPCODE_AT_BUF2_WRITE 0x0387 +#define OPCODE_AT_BUF1_ERASE_PROGRAM 0x0283 +#define OPCODE_AT_BUF2_ERASE_PROGRAM 0x0286 +#define OPCODE_AT_BUF1_PROGRAM 0x0288 +#define OPCODE_AT_BUF2_PROGRAM 0x0289 +#define OPCODE_AT_PAGE_ERASE 0x0281 +#define OPCODE_AT_BLOCK_ERASE 0x0250 +#define OPCODE_AT_BUF1_WRITE_ERASE_PROGRAM 0x0382 +#define OPCODE_AT_BUF2_WRITE_ERASE_PROGRAM 0x0385 +#define OPCODE_AT_BUF1_LOAD 0x0253 +#define OPCODE_AT_BUF2_LOAD 0x0255 +#define OPCODE_AT_BUF1_COMPARE 0x0260 +#define OPCODE_AT_BUF2_COMPARE 0x0261 +#define OPCODE_AT_BUF1_REPROGRAM 0x0258 +#define OPCODE_AT_BUF2_REPROGRAM 0x0259 + +/* Status register bits for ST flashes */ +#define SR_ST_WIP 0x01 /* Write In Progress */ +#define SR_ST_WEL 0x02 /* Write Enable Latch */ +#define SR_ST_BP_MASK 0x1c /* Block Protect */ +#define SR_ST_BP_SHIFT 2 +#define SR_ST_SRWD 0x80 /* Status Register Write Disable */ + +/* Status register bits for Atmel flashes */ +#define SR_AT_READY 0x80 +#define SR_AT_MISMATCH 0x40 +#define SR_AT_ID_MASK 0x38 +#define SR_AT_ID_SHIFT 3 + +struct bcma_drv_cc; + +enum bcm47xxsflash_type { + BCM47XXSFLASH_TYPE_ATMEL, + BCM47XXSFLASH_TYPE_ST, +}; + struct bcm47xxsflash { + struct bcma_drv_cc *bcma_cc; + + enum bcm47xxsflash_type type; + u32 window; u32 blocksize; u16 numblocks; diff --git a/drivers/mtd/devices/doc2000.c b/drivers/mtd/devices/doc2000.c deleted file mode 100644 index a4eb8b5b85ec..000000000000 --- a/drivers/mtd/devices/doc2000.c +++ /dev/null @@ -1,1178 +0,0 @@ - -/* - * Linux driver for Disk-On-Chip 2000 and Millennium - * (c) 1999 Machine Vision Holdings, Inc. - * (c) 1999, 2000 David Woodhouse - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include - -#define DOC_SUPPORT_2000 -#define DOC_SUPPORT_2000TSOP -#define DOC_SUPPORT_MILLENNIUM - -#ifdef DOC_SUPPORT_2000 -#define DoC_is_2000(doc) (doc->ChipID == DOC_ChipID_Doc2k) -#else -#define DoC_is_2000(doc) (0) -#endif - -#if defined(DOC_SUPPORT_2000TSOP) || defined(DOC_SUPPORT_MILLENNIUM) -#define DoC_is_Millennium(doc) (doc->ChipID == DOC_ChipID_DocMil) -#else -#define DoC_is_Millennium(doc) (0) -#endif - -/* #define ECC_DEBUG */ - -/* I have no idea why some DoC chips can not use memcpy_from|to_io(). - * This may be due to the different revisions of the ASIC controller built-in or - * simplily a QA/Bug issue. Who knows ?? If you have trouble, please uncomment - * this: - #undef USE_MEMCPY -*/ - -static int doc_read(struct mtd_info *mtd, loff_t from, size_t len, - size_t *retlen, u_char *buf); -static int doc_write(struct mtd_info *mtd, loff_t to, size_t len, - size_t *retlen, const u_char *buf); -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_write_oob_nolock(struct mtd_info *mtd, loff_t ofs, size_t len, - size_t *retlen, const u_char *buf); -static int doc_erase (struct mtd_info *mtd, struct erase_info *instr); - -static struct mtd_info *doc2klist = NULL; - -/* Perform the required delay cycles by reading from the appropriate register */ -static void DoC_Delay(struct DiskOnChip *doc, unsigned short cycles) -{ - volatile char dummy; - int i; - - for (i = 0; i < cycles; i++) { - if (DoC_is_Millennium(doc)) - dummy = ReadDOC(doc->virtadr, NOP); - else - dummy = ReadDOC(doc->virtadr, DOCStatus); - } - -} - -/* DOC_WaitReady: Wait for RDY line to be asserted by the flash chip */ -static int _DoC_WaitReady(struct DiskOnChip *doc) -{ - void __iomem *docptr = doc->virtadr; - unsigned long timeo = jiffies + (HZ * 10); - - pr_debug("_DoC_WaitReady called for out-of-line wait\n"); - - /* Out-of-line routine to wait for chip response */ - while (!(ReadDOC(docptr, CDSNControl) & CDSN_CTRL_FR_B)) { - /* issue 2 read from NOP register after reading from CDSNControl register - see Software Requirement 11.4 item 2. */ - DoC_Delay(doc, 2); - - if (time_after(jiffies, timeo)) { - pr_debug("_DoC_WaitReady timed out.\n"); - return -EIO; - } - udelay(1); - cond_resched(); - } - - return 0; -} - -static inline int DoC_WaitReady(struct DiskOnChip *doc) -{ - void __iomem *docptr = doc->virtadr; - - /* This is inline, to optimise the common case, where it's ready instantly */ - int ret = 0; - - /* 4 read form NOP register should be issued in prior to the read from CDSNControl - see Software Requirement 11.4 item 2. */ - DoC_Delay(doc, 4); - - if (!(ReadDOC(docptr, CDSNControl) & CDSN_CTRL_FR_B)) - /* Call the out-of-line routine to wait */ - ret = _DoC_WaitReady(doc); - - /* issue 2 read from NOP register after reading from CDSNControl register - see Software Requirement 11.4 item 2. */ - DoC_Delay(doc, 2); - - return ret; -} - -/* DoC_Command: Send a flash command to the flash chip through the CDSN Slow IO register to - bypass the internal pipeline. Each of 4 delay cycles (read from the NOP register) is - required after writing to CDSN Control register, see Software Requirement 11.4 item 3. */ - -static int DoC_Command(struct DiskOnChip *doc, unsigned char command, - unsigned char xtraflags) -{ - void __iomem *docptr = doc->virtadr; - - if (DoC_is_2000(doc)) - xtraflags |= CDSN_CTRL_FLASH_IO; - - /* Assert the CLE (Command Latch Enable) line to the flash chip */ - WriteDOC(xtraflags | CDSN_CTRL_CLE | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - if (DoC_is_Millennium(doc)) - WriteDOC(command, docptr, CDSNSlowIO); - - /* Send the command */ - WriteDOC_(command, docptr, doc->ioreg); - if (DoC_is_Millennium(doc)) - WriteDOC(command, docptr, WritePipeTerm); - - /* Lower the CLE line */ - WriteDOC(xtraflags | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - /* Wait for the chip to respond - Software requirement 11.4.1 (extended for any command) */ - return DoC_WaitReady(doc); -} - -/* DoC_Address: Set the current address for the flash chip through the CDSN Slow IO register to - bypass the internal pipeline. Each of 4 delay cycles (read from the NOP register) is - required after writing to CDSN Control register, see Software Requirement 11.4 item 3. */ - -static int DoC_Address(struct DiskOnChip *doc, int numbytes, unsigned long ofs, - unsigned char xtraflags1, unsigned char xtraflags2) -{ - int i; - void __iomem *docptr = doc->virtadr; - - if (DoC_is_2000(doc)) - xtraflags1 |= CDSN_CTRL_FLASH_IO; - - /* Assert the ALE (Address Latch Enable) line to the flash chip */ - WriteDOC(xtraflags1 | CDSN_CTRL_ALE | CDSN_CTRL_CE, docptr, CDSNControl); - - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - /* Send the address */ - /* Devices with 256-byte page are addressed as: - Column (bits 0-7), Page (bits 8-15, 16-23, 24-31) - * there is no device on the market with page256 - and more than 24 bits. - Devices with 512-byte page are addressed as: - Column (bits 0-7), Page (bits 9-16, 17-24, 25-31) - * 25-31 is sent only if the chip support it. - * bit 8 changes the read command to be sent - (NAND_CMD_READ0 or NAND_CMD_READ1). - */ - - if (numbytes == ADDR_COLUMN || numbytes == ADDR_COLUMN_PAGE) { - if (DoC_is_Millennium(doc)) - WriteDOC(ofs & 0xff, docptr, CDSNSlowIO); - WriteDOC_(ofs & 0xff, docptr, doc->ioreg); - } - - if (doc->page256) { - ofs = ofs >> 8; - } else { - ofs = ofs >> 9; - } - - if (numbytes == ADDR_PAGE || numbytes == ADDR_COLUMN_PAGE) { - for (i = 0; i < doc->pageadrlen; i++, ofs = ofs >> 8) { - if (DoC_is_Millennium(doc)) - WriteDOC(ofs & 0xff, docptr, CDSNSlowIO); - WriteDOC_(ofs & 0xff, docptr, doc->ioreg); - } - } - - if (DoC_is_Millennium(doc)) - WriteDOC(ofs & 0xff, docptr, WritePipeTerm); - - DoC_Delay(doc, 2); /* Needed for some slow flash chips. mf. */ - - /* FIXME: The SlowIO's for millennium could be replaced by - a single WritePipeTerm here. mf. */ - - /* Lower the ALE line */ - WriteDOC(xtraflags1 | xtraflags2 | CDSN_CTRL_CE, docptr, - CDSNControl); - - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - /* Wait for the chip to respond - Software requirement 11.4.1 */ - return DoC_WaitReady(doc); -} - -/* Read a buffer from DoC, taking care of Millennium odditys */ -static void DoC_ReadBuf(struct DiskOnChip *doc, u_char * buf, int len) -{ - volatile int dummy; - int modulus = 0xffff; - void __iomem *docptr = doc->virtadr; - int i; - - if (len <= 0) - return; - - if (DoC_is_Millennium(doc)) { - /* Read the data via the internal pipeline through CDSN IO register, - see Pipelined Read Operations 11.3 */ - dummy = ReadDOC(docptr, ReadPipeInit); - - /* Millennium should use the LastDataRead register - Pipeline Reads */ - len--; - - /* This is needed for correctly ECC calculation */ - modulus = 0xff; - } - - for (i = 0; i < len; i++) - buf[i] = ReadDOC_(docptr, doc->ioreg + (i & modulus)); - - if (DoC_is_Millennium(doc)) { - buf[i] = ReadDOC(docptr, LastDataRead); - } -} - -/* Write a buffer to DoC, taking care of Millennium odditys */ -static void DoC_WriteBuf(struct DiskOnChip *doc, const u_char * buf, int len) -{ - void __iomem *docptr = doc->virtadr; - int i; - - if (len <= 0) - return; - - for (i = 0; i < len; i++) - WriteDOC_(buf[i], docptr, doc->ioreg + i); - - if (DoC_is_Millennium(doc)) { - WriteDOC(0x00, docptr, WritePipeTerm); - } -} - - -/* DoC_SelectChip: Select a given flash chip within the current floor */ - -static inline int DoC_SelectChip(struct DiskOnChip *doc, int chip) -{ - void __iomem *docptr = doc->virtadr; - - /* Software requirement 11.4.4 before writing DeviceSelect */ - /* Deassert the CE line to eliminate glitches on the FCE# outputs */ - WriteDOC(CDSN_CTRL_WP, docptr, CDSNControl); - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - /* Select the individual flash chip requested */ - WriteDOC(chip, docptr, CDSNDeviceSelect); - DoC_Delay(doc, 4); - - /* Reassert the CE line */ - WriteDOC(CDSN_CTRL_CE | CDSN_CTRL_FLASH_IO | CDSN_CTRL_WP, docptr, - CDSNControl); - DoC_Delay(doc, 4); /* Software requirement 11.4.3 for Millennium */ - - /* Wait for it to be ready */ - return DoC_WaitReady(doc); -} - -/* DoC_SelectFloor: Select a given floor (bank of flash chips) */ - -static inline int DoC_SelectFloor(struct DiskOnChip *doc, int floor) -{ - void __iomem *docptr = doc->virtadr; - - /* Select the floor (bank) of chips required */ - WriteDOC(floor, docptr, FloorSelect); - - /* Wait for the chip to be ready */ - return DoC_WaitReady(doc); -} - -/* DoC_IdentChip: Identify a given NAND chip given {floor,chip} */ - -static int DoC_IdentChip(struct DiskOnChip *doc, int floor, int chip) -{ - int mfr, id, i, j; - volatile char dummy; - - /* Page in the required floor/chip */ - DoC_SelectFloor(doc, floor); - DoC_SelectChip(doc, chip); - - /* Reset the chip */ - if (DoC_Command(doc, NAND_CMD_RESET, CDSN_CTRL_WP)) { - pr_debug("DoC_Command (reset) for %d,%d returned true\n", - floor, chip); - return 0; - } - - - /* Read the NAND chip ID: 1. Send ReadID command */ - if (DoC_Command(doc, NAND_CMD_READID, CDSN_CTRL_WP)) { - pr_debug("DoC_Command (ReadID) for %d,%d returned true\n", - floor, chip); - return 0; - } - - /* Read the NAND chip ID: 2. Send address byte zero */ - DoC_Address(doc, ADDR_COLUMN, 0, CDSN_CTRL_WP, 0); - - /* Read the manufacturer and device id codes from the device */ - - if (DoC_is_Millennium(doc)) { - DoC_Delay(doc, 2); - dummy = ReadDOC(doc->virtadr, ReadPipeInit); - mfr = ReadDOC(doc->virtadr, LastDataRead); - - DoC_Delay(doc, 2); - dummy = ReadDOC(doc->virtadr, ReadPipeInit); - id = ReadDOC(doc->virtadr, LastDataRead); - } else { - /* CDSN Slow IO register see Software Req 11.4 item 5. */ - dummy = ReadDOC(doc->virtadr, CDSNSlowIO); - DoC_Delay(doc, 2); - mfr = ReadDOC_(doc->virtadr, doc->ioreg); - - /* CDSN Slow IO register see Software Req 11.4 item 5. */ - dummy = ReadDOC(doc->virtadr, CDSNSlowIO); - DoC_Delay(doc, 2); - id = ReadDOC_(doc->virtadr, doc->ioreg); - } - - /* No response - return failure */ - if (mfr == 0xff || mfr == 0) - return 0; - - /* Check it's the same as the first chip we identified. - * M-Systems say that any given DiskOnChip device should only - * contain _one_ type of flash part, although that's not a - * hardware restriction. */ - if (doc->mfr) { - if (doc->mfr == mfr && doc->id == id) - return 1; /* This is the same as the first */ - else - printk(KERN_WARNING - "Flash chip at floor %d, chip %d is different:\n", - floor, chip); - } - - /* Print and store the manufacturer and ID codes. */ - for (i = 0; nand_flash_ids[i].name != NULL; i++) { - if (id == nand_flash_ids[i].id) { - /* Try to identify manufacturer */ - for (j = 0; nand_manuf_ids[j].id != 0x0; j++) { - if (nand_manuf_ids[j].id == mfr) - break; - } - printk(KERN_INFO - "Flash chip found: Manufacturer ID: %2.2X, " - "Chip ID: %2.2X (%s:%s)\n", mfr, id, - nand_manuf_ids[j].name, nand_flash_ids[i].name); - if (!doc->mfr) { - doc->mfr = mfr; - doc->id = id; - doc->chipshift = - ffs((nand_flash_ids[i].chipsize << 20)) - 1; - doc->page256 = (nand_flash_ids[i].pagesize == 256) ? 1 : 0; - doc->pageadrlen = doc->chipshift > 25 ? 3 : 2; - doc->erasesize = - nand_flash_ids[i].erasesize; - return 1; - } - return 0; - } - } - - - /* We haven't fully identified the chip. Print as much as we know. */ - printk(KERN_WARNING "Unknown flash chip found: %2.2X %2.2X\n", - id, mfr); - - printk(KERN_WARNING "Please report to dwmw2@infradead.org\n"); - return 0; -} - -/* DoC_ScanChips: Find all NAND chips present in a DiskOnChip, and identify them */ - -static void DoC_ScanChips(struct DiskOnChip *this, int maxchips) -{ - int floor, chip; - int numchips[MAX_FLOORS]; - int ret = 1; - - this->numchips = 0; - this->mfr = 0; - this->id = 0; - - /* For each floor, find the number of valid chips it contains */ - for (floor = 0; floor < MAX_FLOORS; floor++) { - ret = 1; - numchips[floor] = 0; - for (chip = 0; chip < maxchips && ret != 0; chip++) { - - ret = DoC_IdentChip(this, floor, chip); - if (ret) { - numchips[floor]++; - this->numchips++; - } - } - } - - /* If there are none at all that we recognise, bail */ - if (!this->numchips) { - printk(KERN_NOTICE "No flash chips recognised.\n"); - return; - } - - /* Allocate an array to hold the information for each chip */ - this->chips = kmalloc(sizeof(struct Nand) * this->numchips, GFP_KERNEL); - if (!this->chips) { - printk(KERN_NOTICE "No memory for allocating chip info structures\n"); - return; - } - - ret = 0; - - /* Fill out the chip array with {floor, chipno} for each - * detected chip in the device. */ - for (floor = 0; floor < MAX_FLOORS; floor++) { - for (chip = 0; chip < numchips[floor]; chip++) { - this->chips[ret].floor = floor; - this->chips[ret].chip = chip; - this->chips[ret].curadr = 0; - this->chips[ret].curmode = 0x50; - ret++; - } - } - - /* Calculate and print the total size of the device */ - this->totlen = this->numchips * (1 << this->chipshift); - - printk(KERN_INFO "%d flash chips found. Total DiskOnChip size: %ld MiB\n", - this->numchips, this->totlen >> 20); -} - -static int DoC2k_is_alias(struct DiskOnChip *doc1, struct DiskOnChip *doc2) -{ - int tmp1, tmp2, retval; - if (doc1->physadr == doc2->physadr) - return 1; - - /* Use the alias resolution register which was set aside for this - * purpose. If it's value is the same on both chips, they might - * be the same chip, and we write to one and check for a change in - * the other. It's unclear if this register is usuable in the - * DoC 2000 (it's in the Millennium docs), but it seems to work. */ - tmp1 = ReadDOC(doc1->virtadr, AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, AliasResolution); - if (tmp1 != tmp2) - return 0; - - WriteDOC((tmp1 + 1) % 0xff, doc1->virtadr, AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, AliasResolution); - if (tmp2 == (tmp1 + 1) % 0xff) - retval = 1; - else - retval = 0; - - /* Restore register contents. May not be necessary, but do it just to - * be safe. */ - WriteDOC(tmp1, doc1->virtadr, AliasResolution); - - return retval; -} - -/* This routine is found from the docprobe code by symbol_get(), - * which will bump the use count of this module. */ -void DoC2k_init(struct mtd_info *mtd) -{ - struct DiskOnChip *this = mtd->priv; - struct DiskOnChip *old = NULL; - int maxchips; - - /* We must avoid being called twice for the same device. */ - - if (doc2klist) - old = doc2klist->priv; - - while (old) { - if (DoC2k_is_alias(old, this)) { - printk(KERN_NOTICE - "Ignoring DiskOnChip 2000 at 0x%lX - already configured\n", - this->physadr); - iounmap(this->virtadr); - kfree(mtd); - return; - } - if (old->nextdoc) - old = old->nextdoc->priv; - else - old = NULL; - } - - - switch (this->ChipID) { - case DOC_ChipID_Doc2kTSOP: - mtd->name = "DiskOnChip 2000 TSOP"; - this->ioreg = DoC_Mil_CDSN_IO; - /* Pretend it's a Millennium */ - this->ChipID = DOC_ChipID_DocMil; - maxchips = MAX_CHIPS; - break; - case DOC_ChipID_Doc2k: - mtd->name = "DiskOnChip 2000"; - this->ioreg = DoC_2k_CDSN_IO; - maxchips = MAX_CHIPS; - break; - case DOC_ChipID_DocMil: - mtd->name = "DiskOnChip Millennium"; - this->ioreg = DoC_Mil_CDSN_IO; - maxchips = MAX_CHIPS_MIL; - break; - default: - printk("Unknown ChipID 0x%02x\n", this->ChipID); - kfree(mtd); - iounmap(this->virtadr); - return; - } - - printk(KERN_NOTICE "%s found at address 0x%lX\n", mtd->name, - this->physadr); - - mtd->type = MTD_NANDFLASH; - mtd->flags = MTD_CAP_NANDFLASH; - mtd->writebufsize = mtd->writesize = 512; - mtd->oobsize = 16; - mtd->ecc_strength = 2; - mtd->owner = THIS_MODULE; - mtd->_erase = doc_erase; - mtd->_read = doc_read; - mtd->_write = doc_write; - mtd->_read_oob = doc_read_oob; - mtd->_write_oob = doc_write_oob; - this->curfloor = -1; - this->curchip = -1; - mutex_init(&this->lock); - - /* Ident all the chips present. */ - DoC_ScanChips(this, maxchips); - - if (!this->totlen) { - kfree(mtd); - iounmap(this->virtadr); - } else { - this->nextdoc = doc2klist; - doc2klist = mtd; - mtd->size = this->totlen; - mtd->erasesize = this->erasesize; - mtd_device_register(mtd, NULL, 0); - return; - } -} -EXPORT_SYMBOL_GPL(DoC2k_init); - -static int doc_read(struct mtd_info *mtd, loff_t from, size_t len, - size_t * retlen, u_char * buf) -{ - struct DiskOnChip *this = mtd->priv; - void __iomem *docptr = this->virtadr; - struct Nand *mychip; - unsigned char syndrome[6], eccbuf[6]; - volatile char dummy; - int i, len256 = 0, ret=0; - size_t left = len; - - mutex_lock(&this->lock); - while (left) { - len = left; - - /* Don't allow a single read to cross a 512-byte block boundary */ - if (from + len > ((from | 0x1ff) + 1)) - len = ((from | 0x1ff) + 1) - from; - - /* The ECC will not be calculated correctly if less than 512 is read */ - if (len != 0x200) - printk(KERN_WARNING - "ECC needs a full sector read (adr: %lx size %lx)\n", - (long) from, (long) len); - - /* printk("DoC_Read (adr: %lx size %lx)\n", (long) from, (long) len); */ - - - /* Find the chip which is to be used and select it */ - mychip = &this->chips[from >> (this->chipshift)]; - - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(this, mychip->floor); - DoC_SelectChip(this, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(this, mychip->chip); - } - - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - DoC_Command(this, - (!this->page256 - && (from & 0x100)) ? NAND_CMD_READ1 : NAND_CMD_READ0, - CDSN_CTRL_WP); - DoC_Address(this, ADDR_COLUMN_PAGE, from, CDSN_CTRL_WP, - CDSN_CTRL_ECC_IO); - - /* Prime the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, ECCConf); - WriteDOC(DOC_ECC_EN, docptr, ECCConf); - - /* treat crossing 256-byte sector for 2M x 8bits devices */ - if (this->page256 && from + len > (from | 0xff) + 1) { - len256 = (from | 0xff) + 1 - from; - DoC_ReadBuf(this, buf, len256); - - DoC_Command(this, NAND_CMD_READ0, CDSN_CTRL_WP); - DoC_Address(this, ADDR_COLUMN_PAGE, from + len256, - CDSN_CTRL_WP, CDSN_CTRL_ECC_IO); - } - - DoC_ReadBuf(this, &buf[len256], len - len256); - - /* Let the caller know we completed it */ - *retlen += len; - - /* Read the ECC data through the DiskOnChip ECC logic */ - /* Note: this will work even with 2M x 8bit devices as */ - /* they have 8 bytes of OOB per 256 page. mf. */ - DoC_ReadBuf(this, eccbuf, 6); - - /* Flush the pipeline */ - if (DoC_is_Millennium(this)) { - dummy = ReadDOC(docptr, ECCConf); - dummy = ReadDOC(docptr, ECCConf); - i = ReadDOC(docptr, ECCConf); - } else { - dummy = ReadDOC(docptr, 2k_ECCStatus); - dummy = ReadDOC(docptr, 2k_ECCStatus); - i = ReadDOC(docptr, 2k_ECCStatus); - } - - /* Check the ECC Status */ - if (i & 0x80) { - int nb_errors; - /* There was an ECC error */ -#ifdef ECC_DEBUG - printk(KERN_ERR "DiskOnChip ECC Error: Read at %lx\n", (long)from); -#endif - /* Read the ECC syndrome through the DiskOnChip ECC - logic. These syndrome will be all ZERO when there - is no error */ - for (i = 0; i < 6; i++) { - syndrome[i] = - ReadDOC(docptr, ECCSyndrome0 + i); - } - nb_errors = doc_decode_ecc(buf, syndrome); - -#ifdef ECC_DEBUG - printk(KERN_ERR "Errors corrected: %x\n", nb_errors); -#endif - if (nb_errors < 0) { - /* We return error, but have actually done the - read. Not that this can be told to - user-space, via sys_read(), but at least - MTD-aware stuff can know about it by - checking *retlen */ - ret = -EIO; - } - } - -#ifdef PSYCHO_DEBUG - printk(KERN_DEBUG "ECC DATA at %lxB: %2.2X %2.2X %2.2X %2.2X %2.2X %2.2X\n", - (long)from, eccbuf[0], eccbuf[1], eccbuf[2], - eccbuf[3], eccbuf[4], eccbuf[5]); -#endif - - /* disable the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr , ECCConf); - - /* according to 11.4.1, we need to wait for the busy line - * drop if we read to the end of the page. */ - if(0 == ((from + len) & 0x1ff)) - { - DoC_WaitReady(this); - } - - from += len; - left -= len; - buf += len; - } - - mutex_unlock(&this->lock); - - return ret; -} - -static int doc_write(struct mtd_info *mtd, loff_t to, size_t len, - size_t * retlen, const u_char * buf) -{ - struct DiskOnChip *this = mtd->priv; - int di; /* Yes, DI is a hangover from when I was disassembling the binary driver */ - void __iomem *docptr = this->virtadr; - unsigned char eccbuf[6]; - volatile char dummy; - int len256 = 0; - struct Nand *mychip; - size_t left = len; - int status; - - mutex_lock(&this->lock); - while (left) { - len = left; - - /* Don't allow a single write to cross a 512-byte block boundary */ - if (to + len > ((to | 0x1ff) + 1)) - len = ((to | 0x1ff) + 1) - to; - - /* The ECC will not be calculated correctly if less than 512 is written */ -/* DBB- - if (len != 0x200 && eccbuf) - printk(KERN_WARNING - "ECC needs a full sector write (adr: %lx size %lx)\n", - (long) to, (long) len); - -DBB */ - - /* printk("DoC_Write (adr: %lx size %lx)\n", (long) to, (long) len); */ - - /* Find the chip which is to be used and select it */ - mychip = &this->chips[to >> (this->chipshift)]; - - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(this, mychip->floor); - DoC_SelectChip(this, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(this, mychip->chip); - } - - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Set device to main plane of flash */ - DoC_Command(this, NAND_CMD_RESET, CDSN_CTRL_WP); - DoC_Command(this, - (!this->page256 - && (to & 0x100)) ? NAND_CMD_READ1 : NAND_CMD_READ0, - CDSN_CTRL_WP); - - DoC_Command(this, NAND_CMD_SEQIN, 0); - DoC_Address(this, ADDR_COLUMN_PAGE, to, 0, CDSN_CTRL_ECC_IO); - - /* Prime the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, ECCConf); - WriteDOC(DOC_ECC_EN | DOC_ECC_RW, docptr, ECCConf); - - /* treat crossing 256-byte sector for 2M x 8bits devices */ - if (this->page256 && to + len > (to | 0xff) + 1) { - len256 = (to | 0xff) + 1 - to; - DoC_WriteBuf(this, buf, len256); - - DoC_Command(this, NAND_CMD_PAGEPROG, 0); - - DoC_Command(this, NAND_CMD_STATUS, CDSN_CTRL_WP); - /* There's an implicit DoC_WaitReady() in DoC_Command */ - - dummy = ReadDOC(docptr, CDSNSlowIO); - DoC_Delay(this, 2); - - if (ReadDOC_(docptr, this->ioreg) & 1) { - printk(KERN_ERR "Error programming flash\n"); - /* Error in programming */ - *retlen = 0; - mutex_unlock(&this->lock); - return -EIO; - } - - DoC_Command(this, NAND_CMD_SEQIN, 0); - DoC_Address(this, ADDR_COLUMN_PAGE, to + len256, 0, - CDSN_CTRL_ECC_IO); - } - - DoC_WriteBuf(this, &buf[len256], len - len256); - - WriteDOC(CDSN_CTRL_ECC_IO | CDSN_CTRL_CE, docptr, CDSNControl); - - if (DoC_is_Millennium(this)) { - WriteDOC(0, docptr, NOP); - WriteDOC(0, docptr, NOP); - WriteDOC(0, docptr, NOP); - } else { - WriteDOC_(0, docptr, this->ioreg); - WriteDOC_(0, docptr, this->ioreg); - WriteDOC_(0, docptr, this->ioreg); - } - - WriteDOC(CDSN_CTRL_ECC_IO | CDSN_CTRL_FLASH_IO | CDSN_CTRL_CE, docptr, - CDSNControl); - - /* Read the ECC data through the DiskOnChip ECC logic */ - for (di = 0; di < 6; di++) { - eccbuf[di] = ReadDOC(docptr, ECCSyndrome0 + di); - } - - /* Reset the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr, ECCConf); - -#ifdef PSYCHO_DEBUG - printk - ("OOB data at %lx is %2.2X %2.2X %2.2X %2.2X %2.2X %2.2X\n", - (long) to, eccbuf[0], eccbuf[1], eccbuf[2], eccbuf[3], - eccbuf[4], eccbuf[5]); -#endif - DoC_Command(this, NAND_CMD_PAGEPROG, 0); - - DoC_Command(this, NAND_CMD_STATUS, CDSN_CTRL_WP); - /* There's an implicit DoC_WaitReady() in DoC_Command */ - - if (DoC_is_Millennium(this)) { - ReadDOC(docptr, ReadPipeInit); - status = ReadDOC(docptr, LastDataRead); - } else { - dummy = ReadDOC(docptr, CDSNSlowIO); - DoC_Delay(this, 2); - status = ReadDOC_(docptr, this->ioreg); - } - - if (status & 1) { - printk(KERN_ERR "Error programming flash\n"); - /* Error in programming */ - *retlen = 0; - mutex_unlock(&this->lock); - return -EIO; - } - - /* Let the caller know we completed it */ - *retlen += len; - - { - unsigned char x[8]; - size_t dummy; - int ret; - - /* Write the ECC data to flash */ - for (di=0; di<6; di++) - x[di] = eccbuf[di]; - - x[6]=0x55; - x[7]=0x55; - - ret = doc_write_oob_nolock(mtd, to, 8, &dummy, x); - if (ret) { - mutex_unlock(&this->lock); - return ret; - } - } - - to += len; - left -= len; - buf += len; - } - - mutex_unlock(&this->lock); - return 0; -} - -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ - struct DiskOnChip *this = mtd->priv; - int len256 = 0, ret; - struct Nand *mychip; - uint8_t *buf = ops->oobbuf; - size_t len = ops->len; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - ofs += ops->ooboffs; - - mutex_lock(&this->lock); - - mychip = &this->chips[ofs >> this->chipshift]; - - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(this, mychip->floor); - DoC_SelectChip(this, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(this, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* update address for 2M x 8bit devices. OOB starts on the second */ - /* page to maintain compatibility with doc_read_ecc. */ - if (this->page256) { - if (!(ofs & 0x8)) - ofs += 0x100; - else - ofs -= 0x8; - } - - DoC_Command(this, NAND_CMD_READOOB, CDSN_CTRL_WP); - DoC_Address(this, ADDR_COLUMN_PAGE, ofs, CDSN_CTRL_WP, 0); - - /* treat crossing 8-byte OOB data for 2M x 8bit devices */ - /* Note: datasheet says it should automaticaly wrap to the */ - /* next OOB block, but it didn't work here. mf. */ - if (this->page256 && ofs + len > (ofs | 0x7) + 1) { - len256 = (ofs | 0x7) + 1 - ofs; - DoC_ReadBuf(this, buf, len256); - - DoC_Command(this, NAND_CMD_READOOB, CDSN_CTRL_WP); - DoC_Address(this, ADDR_COLUMN_PAGE, ofs & (~0x1ff), - CDSN_CTRL_WP, 0); - } - - DoC_ReadBuf(this, &buf[len256], len - len256); - - ops->retlen = len; - /* Reading the full OOB data drops us off of the end of the page, - * causing the flash device to go into busy mode, so we need - * to wait until ready 11.4.1 and Toshiba TC58256FT docs */ - - ret = DoC_WaitReady(this); - - mutex_unlock(&this->lock); - return ret; - -} - -static int doc_write_oob_nolock(struct mtd_info *mtd, loff_t ofs, size_t len, - size_t * retlen, const u_char * buf) -{ - struct DiskOnChip *this = mtd->priv; - int len256 = 0; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - volatile int dummy; - int status; - - // printk("doc_write_oob(%lx, %d): %2.2X %2.2X %2.2X %2.2X ... %2.2X %2.2X .. %2.2X %2.2X\n",(long)ofs, len, - // buf[0], buf[1], buf[2], buf[3], buf[8], buf[9], buf[14],buf[15]); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(this, mychip->floor); - DoC_SelectChip(this, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(this, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* disable the ECC engine */ - WriteDOC (DOC_ECC_RESET, docptr, ECCConf); - WriteDOC (DOC_ECC_DIS, docptr, ECCConf); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(this, NAND_CMD_RESET, CDSN_CTRL_WP); - - /* issue the Read2 command to set the pointer to the Spare Data Area. */ - DoC_Command(this, NAND_CMD_READOOB, CDSN_CTRL_WP); - - /* update address for 2M x 8bit devices. OOB starts on the second */ - /* page to maintain compatibility with doc_read_ecc. */ - if (this->page256) { - if (!(ofs & 0x8)) - ofs += 0x100; - else - ofs -= 0x8; - } - - /* issue the Serial Data In command to initial the Page Program process */ - DoC_Command(this, NAND_CMD_SEQIN, 0); - DoC_Address(this, ADDR_COLUMN_PAGE, ofs, 0, 0); - - /* treat crossing 8-byte OOB data for 2M x 8bit devices */ - /* Note: datasheet says it should automaticaly wrap to the */ - /* next OOB block, but it didn't work here. mf. */ - if (this->page256 && ofs + len > (ofs | 0x7) + 1) { - len256 = (ofs | 0x7) + 1 - ofs; - DoC_WriteBuf(this, buf, len256); - - DoC_Command(this, NAND_CMD_PAGEPROG, 0); - DoC_Command(this, NAND_CMD_STATUS, 0); - /* DoC_WaitReady() is implicit in DoC_Command */ - - if (DoC_is_Millennium(this)) { - ReadDOC(docptr, ReadPipeInit); - status = ReadDOC(docptr, LastDataRead); - } else { - dummy = ReadDOC(docptr, CDSNSlowIO); - DoC_Delay(this, 2); - status = ReadDOC_(docptr, this->ioreg); - } - - if (status & 1) { - printk(KERN_ERR "Error programming oob data\n"); - /* There was an error */ - *retlen = 0; - return -EIO; - } - DoC_Command(this, NAND_CMD_SEQIN, 0); - DoC_Address(this, ADDR_COLUMN_PAGE, ofs & (~0x1ff), 0, 0); - } - - DoC_WriteBuf(this, &buf[len256], len - len256); - - DoC_Command(this, NAND_CMD_PAGEPROG, 0); - DoC_Command(this, NAND_CMD_STATUS, 0); - /* DoC_WaitReady() is implicit in DoC_Command */ - - if (DoC_is_Millennium(this)) { - ReadDOC(docptr, ReadPipeInit); - status = ReadDOC(docptr, LastDataRead); - } else { - dummy = ReadDOC(docptr, CDSNSlowIO); - DoC_Delay(this, 2); - status = ReadDOC_(docptr, this->ioreg); - } - - if (status & 1) { - printk(KERN_ERR "Error programming oob data\n"); - /* There was an error */ - *retlen = 0; - return -EIO; - } - - *retlen = len; - return 0; - -} - -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ - struct DiskOnChip *this = mtd->priv; - int ret; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - mutex_lock(&this->lock); - ret = doc_write_oob_nolock(mtd, ofs + ops->ooboffs, ops->len, - &ops->retlen, ops->oobbuf); - - mutex_unlock(&this->lock); - return ret; -} - -static int doc_erase(struct mtd_info *mtd, struct erase_info *instr) -{ - struct DiskOnChip *this = mtd->priv; - __u32 ofs = instr->addr; - __u32 len = instr->len; - volatile int dummy; - void __iomem *docptr = this->virtadr; - struct Nand *mychip; - int status; - - mutex_lock(&this->lock); - - if (ofs & (mtd->erasesize-1) || len & (mtd->erasesize-1)) { - mutex_unlock(&this->lock); - return -EINVAL; - } - - instr->state = MTD_ERASING; - - /* FIXME: Do this in the background. Use timers or schedule_task() */ - while(len) { - mychip = &this->chips[ofs >> this->chipshift]; - - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(this, mychip->floor); - DoC_SelectChip(this, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(this, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - DoC_Command(this, NAND_CMD_ERASE1, 0); - DoC_Address(this, ADDR_PAGE, ofs, 0, 0); - DoC_Command(this, NAND_CMD_ERASE2, 0); - - DoC_Command(this, NAND_CMD_STATUS, CDSN_CTRL_WP); - - if (DoC_is_Millennium(this)) { - ReadDOC(docptr, ReadPipeInit); - status = ReadDOC(docptr, LastDataRead); - } else { - dummy = ReadDOC(docptr, CDSNSlowIO); - DoC_Delay(this, 2); - status = ReadDOC_(docptr, this->ioreg); - } - - if (status & 1) { - printk(KERN_ERR "Error erasing at 0x%x\n", ofs); - /* There was an error */ - instr->state = MTD_ERASE_FAILED; - goto callback; - } - ofs += mtd->erasesize; - len -= mtd->erasesize; - } - instr->state = MTD_ERASE_DONE; - - callback: - mtd_erase_callback(instr); - - mutex_unlock(&this->lock); - return 0; -} - - -/**************************************************************************** - * - * Module stuff - * - ****************************************************************************/ - -static void __exit cleanup_doc2000(void) -{ - struct mtd_info *mtd; - struct DiskOnChip *this; - - while ((mtd = doc2klist)) { - this = mtd->priv; - doc2klist = this->nextdoc; - - mtd_device_unregister(mtd); - - iounmap(this->virtadr); - kfree(this->chips); - kfree(mtd); - } -} - -module_exit(cleanup_doc2000); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("David Woodhouse et al."); -MODULE_DESCRIPTION("MTD driver for DiskOnChip 2000 and Millennium"); - diff --git a/drivers/mtd/devices/doc2001.c b/drivers/mtd/devices/doc2001.c deleted file mode 100644 index f6927955dab0..000000000000 --- a/drivers/mtd/devices/doc2001.c +++ /dev/null @@ -1,824 +0,0 @@ - -/* - * Linux driver for Disk-On-Chip Millennium - * (c) 1999 Machine Vision Holdings, Inc. - * (c) 1999, 2000 David Woodhouse - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include - -/* #define ECC_DEBUG */ - -/* I have no idea why some DoC chips can not use memcop_form|to_io(). - * This may be due to the different revisions of the ASIC controller built-in or - * simplily a QA/Bug issue. Who knows ?? If you have trouble, please uncomment - * this:*/ -#undef USE_MEMCPY - -static int doc_read(struct mtd_info *mtd, loff_t from, size_t len, - size_t *retlen, u_char *buf); -static int doc_write(struct mtd_info *mtd, loff_t to, size_t len, - size_t *retlen, const u_char *buf); -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_erase (struct mtd_info *mtd, struct erase_info *instr); - -static struct mtd_info *docmillist = NULL; - -/* Perform the required delay cycles by reading from the NOP register */ -static void DoC_Delay(void __iomem * docptr, unsigned short cycles) -{ - volatile char dummy; - int i; - - for (i = 0; i < cycles; i++) - dummy = ReadDOC(docptr, NOP); -} - -/* DOC_WaitReady: Wait for RDY line to be asserted by the flash chip */ -static int _DoC_WaitReady(void __iomem * docptr) -{ - unsigned short c = 0xffff; - - pr_debug("_DoC_WaitReady called for out-of-line wait\n"); - - /* Out-of-line routine to wait for chip response */ - while (!(ReadDOC(docptr, CDSNControl) & CDSN_CTRL_FR_B) && --c) - ; - - if (c == 0) - pr_debug("_DoC_WaitReady timed out.\n"); - - return (c == 0); -} - -static inline int DoC_WaitReady(void __iomem * docptr) -{ - /* This is inline, to optimise the common case, where it's ready instantly */ - int ret = 0; - - /* 4 read form NOP register should be issued in prior to the read from CDSNControl - see Software Requirement 11.4 item 2. */ - DoC_Delay(docptr, 4); - - if (!(ReadDOC(docptr, CDSNControl) & CDSN_CTRL_FR_B)) - /* Call the out-of-line routine to wait */ - ret = _DoC_WaitReady(docptr); - - /* issue 2 read from NOP register after reading from CDSNControl register - see Software Requirement 11.4 item 2. */ - DoC_Delay(docptr, 2); - - return ret; -} - -/* DoC_Command: Send a flash command to the flash chip through the CDSN IO register - with the internal pipeline. Each of 4 delay cycles (read from the NOP register) is - required after writing to CDSN Control register, see Software Requirement 11.4 item 3. */ - -static void DoC_Command(void __iomem * docptr, unsigned char command, - unsigned char xtraflags) -{ - /* Assert the CLE (Command Latch Enable) line to the flash chip */ - WriteDOC(xtraflags | CDSN_CTRL_CLE | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(docptr, 4); - - /* Send the command */ - WriteDOC(command, docptr, Mil_CDSN_IO); - WriteDOC(0x00, docptr, WritePipeTerm); - - /* Lower the CLE line */ - WriteDOC(xtraflags | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(docptr, 4); -} - -/* DoC_Address: Set the current address for the flash chip through the CDSN IO register - with the internal pipeline. Each of 4 delay cycles (read from the NOP register) is - required after writing to CDSN Control register, see Software Requirement 11.4 item 3. */ - -static inline void DoC_Address(void __iomem * docptr, int numbytes, unsigned long ofs, - unsigned char xtraflags1, unsigned char xtraflags2) -{ - /* Assert the ALE (Address Latch Enable) line to the flash chip */ - WriteDOC(xtraflags1 | CDSN_CTRL_ALE | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(docptr, 4); - - /* Send the address */ - switch (numbytes) - { - case 1: - /* Send single byte, bits 0-7. */ - WriteDOC(ofs & 0xff, docptr, Mil_CDSN_IO); - WriteDOC(0x00, docptr, WritePipeTerm); - break; - case 2: - /* Send bits 9-16 followed by 17-23 */ - WriteDOC((ofs >> 9) & 0xff, docptr, Mil_CDSN_IO); - WriteDOC((ofs >> 17) & 0xff, docptr, Mil_CDSN_IO); - WriteDOC(0x00, docptr, WritePipeTerm); - break; - case 3: - /* Send 0-7, 9-16, then 17-23 */ - WriteDOC(ofs & 0xff, docptr, Mil_CDSN_IO); - WriteDOC((ofs >> 9) & 0xff, docptr, Mil_CDSN_IO); - WriteDOC((ofs >> 17) & 0xff, docptr, Mil_CDSN_IO); - WriteDOC(0x00, docptr, WritePipeTerm); - break; - default: - return; - } - - /* Lower the ALE line */ - WriteDOC(xtraflags1 | xtraflags2 | CDSN_CTRL_CE, docptr, CDSNControl); - DoC_Delay(docptr, 4); -} - -/* DoC_SelectChip: Select a given flash chip within the current floor */ -static int DoC_SelectChip(void __iomem * docptr, int chip) -{ - /* Select the individual flash chip requested */ - WriteDOC(chip, docptr, CDSNDeviceSelect); - DoC_Delay(docptr, 4); - - /* Wait for it to be ready */ - return DoC_WaitReady(docptr); -} - -/* DoC_SelectFloor: Select a given floor (bank of flash chips) */ -static int DoC_SelectFloor(void __iomem * docptr, int floor) -{ - /* Select the floor (bank) of chips required */ - WriteDOC(floor, docptr, FloorSelect); - - /* Wait for the chip to be ready */ - return DoC_WaitReady(docptr); -} - -/* DoC_IdentChip: Identify a given NAND chip given {floor,chip} */ -static int DoC_IdentChip(struct DiskOnChip *doc, int floor, int chip) -{ - int mfr, id, i, j; - volatile char dummy; - - /* Page in the required floor/chip - FIXME: is this supported by Millennium ?? */ - DoC_SelectFloor(doc->virtadr, floor); - DoC_SelectChip(doc->virtadr, chip); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(doc->virtadr, NAND_CMD_RESET, CDSN_CTRL_WP); - DoC_WaitReady(doc->virtadr); - - /* Read the NAND chip ID: 1. Send ReadID command */ - DoC_Command(doc->virtadr, NAND_CMD_READID, CDSN_CTRL_WP); - - /* Read the NAND chip ID: 2. Send address byte zero */ - DoC_Address(doc->virtadr, 1, 0x00, CDSN_CTRL_WP, 0x00); - - /* Read the manufacturer and device id codes of the flash device through - CDSN IO register see Software Requirement 11.4 item 5.*/ - dummy = ReadDOC(doc->virtadr, ReadPipeInit); - DoC_Delay(doc->virtadr, 2); - mfr = ReadDOC(doc->virtadr, Mil_CDSN_IO); - - DoC_Delay(doc->virtadr, 2); - id = ReadDOC(doc->virtadr, Mil_CDSN_IO); - dummy = ReadDOC(doc->virtadr, LastDataRead); - - /* No response - return failure */ - if (mfr == 0xff || mfr == 0) - return 0; - - /* FIXME: to deal with multi-flash on multi-Millennium case more carefully */ - for (i = 0; nand_flash_ids[i].name != NULL; i++) { - if ( id == nand_flash_ids[i].id) { - /* Try to identify manufacturer */ - for (j = 0; nand_manuf_ids[j].id != 0x0; j++) { - if (nand_manuf_ids[j].id == mfr) - break; - } - printk(KERN_INFO "Flash chip found: Manufacturer ID: %2.2X, " - "Chip ID: %2.2X (%s:%s)\n", - mfr, id, nand_manuf_ids[j].name, nand_flash_ids[i].name); - doc->mfr = mfr; - doc->id = id; - doc->chipshift = ffs((nand_flash_ids[i].chipsize << 20)) - 1; - break; - } - } - - if (nand_flash_ids[i].name == NULL) - return 0; - else - return 1; -} - -/* DoC_ScanChips: Find all NAND chips present in a DiskOnChip, and identify them */ -static void DoC_ScanChips(struct DiskOnChip *this) -{ - int floor, chip; - int numchips[MAX_FLOORS_MIL]; - int ret; - - this->numchips = 0; - this->mfr = 0; - this->id = 0; - - /* For each floor, find the number of valid chips it contains */ - for (floor = 0,ret = 1; floor < MAX_FLOORS_MIL; floor++) { - numchips[floor] = 0; - for (chip = 0; chip < MAX_CHIPS_MIL && ret != 0; chip++) { - ret = DoC_IdentChip(this, floor, chip); - if (ret) { - numchips[floor]++; - this->numchips++; - } - } - } - /* If there are none at all that we recognise, bail */ - if (!this->numchips) { - printk("No flash chips recognised.\n"); - return; - } - - /* Allocate an array to hold the information for each chip */ - this->chips = kmalloc(sizeof(struct Nand) * this->numchips, GFP_KERNEL); - if (!this->chips){ - printk("No memory for allocating chip info structures\n"); - return; - } - - /* Fill out the chip array with {floor, chipno} for each - * detected chip in the device. */ - for (floor = 0, ret = 0; floor < MAX_FLOORS_MIL; floor++) { - for (chip = 0 ; chip < numchips[floor] ; chip++) { - this->chips[ret].floor = floor; - this->chips[ret].chip = chip; - this->chips[ret].curadr = 0; - this->chips[ret].curmode = 0x50; - ret++; - } - } - - /* Calculate and print the total size of the device */ - this->totlen = this->numchips * (1 << this->chipshift); - printk(KERN_INFO "%d flash chips found. Total DiskOnChip size: %ld MiB\n", - this->numchips ,this->totlen >> 20); -} - -static int DoCMil_is_alias(struct DiskOnChip *doc1, struct DiskOnChip *doc2) -{ - int tmp1, tmp2, retval; - - if (doc1->physadr == doc2->physadr) - return 1; - - /* Use the alias resolution register which was set aside for this - * purpose. If it's value is the same on both chips, they might - * be the same chip, and we write to one and check for a change in - * the other. It's unclear if this register is usuable in the - * DoC 2000 (it's in the Millenium docs), but it seems to work. */ - tmp1 = ReadDOC(doc1->virtadr, AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, AliasResolution); - if (tmp1 != tmp2) - return 0; - - WriteDOC((tmp1+1) % 0xff, doc1->virtadr, AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, AliasResolution); - if (tmp2 == (tmp1+1) % 0xff) - retval = 1; - else - retval = 0; - - /* Restore register contents. May not be necessary, but do it just to - * be safe. */ - WriteDOC(tmp1, doc1->virtadr, AliasResolution); - - return retval; -} - -/* This routine is found from the docprobe code by symbol_get(), - * which will bump the use count of this module. */ -void DoCMil_init(struct mtd_info *mtd) -{ - struct DiskOnChip *this = mtd->priv; - struct DiskOnChip *old = NULL; - - /* We must avoid being called twice for the same device. */ - if (docmillist) - old = docmillist->priv; - - while (old) { - if (DoCMil_is_alias(this, old)) { - printk(KERN_NOTICE "Ignoring DiskOnChip Millennium at " - "0x%lX - already configured\n", this->physadr); - iounmap(this->virtadr); - kfree(mtd); - return; - } - if (old->nextdoc) - old = old->nextdoc->priv; - else - old = NULL; - } - - mtd->name = "DiskOnChip Millennium"; - printk(KERN_NOTICE "DiskOnChip Millennium found at address 0x%lX\n", - this->physadr); - - mtd->type = MTD_NANDFLASH; - mtd->flags = MTD_CAP_NANDFLASH; - - /* FIXME: erase size is not always 8KiB */ - mtd->erasesize = 0x2000; - mtd->writebufsize = mtd->writesize = 512; - mtd->oobsize = 16; - mtd->ecc_strength = 2; - mtd->owner = THIS_MODULE; - mtd->_erase = doc_erase; - mtd->_read = doc_read; - mtd->_write = doc_write; - mtd->_read_oob = doc_read_oob; - mtd->_write_oob = doc_write_oob; - this->curfloor = -1; - this->curchip = -1; - - /* Ident all the chips present. */ - DoC_ScanChips(this); - - if (!this->totlen) { - kfree(mtd); - iounmap(this->virtadr); - } else { - this->nextdoc = docmillist; - docmillist = mtd; - mtd->size = this->totlen; - mtd_device_register(mtd, NULL, 0); - return; - } -} -EXPORT_SYMBOL_GPL(DoCMil_init); - -static int doc_read (struct mtd_info *mtd, loff_t from, size_t len, - size_t *retlen, u_char *buf) -{ - int i, ret; - volatile char dummy; - unsigned char syndrome[6], eccbuf[6]; - struct DiskOnChip *this = mtd->priv; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[from >> (this->chipshift)]; - - /* Don't allow a single read to cross a 512-byte block boundary */ - if (from + len > ((from | 0x1ff) + 1)) - len = ((from | 0x1ff) + 1) - from; - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* issue the Read0 or Read1 command depend on which half of the page - we are accessing. Polling the Flash Ready bit after issue 3 bytes - address in Sequence Read Mode, see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, (from >> 8) & 1, CDSN_CTRL_WP); - DoC_Address(docptr, 3, from, CDSN_CTRL_WP, 0x00); - DoC_WaitReady(docptr); - - /* init the ECC engine, see Reed-Solomon EDC/ECC 11.1 .*/ - WriteDOC (DOC_ECC_RESET, docptr, ECCConf); - WriteDOC (DOC_ECC_EN, docptr, ECCConf); - - /* Read the data via the internal pipeline through CDSN IO register, - see Pipelined Read Operations 11.3 */ - dummy = ReadDOC(docptr, ReadPipeInit); -#ifndef USE_MEMCPY - for (i = 0; i < len-1; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - buf[i] = ReadDOC(docptr, Mil_CDSN_IO + (i & 0xff)); - } -#else - memcpy_fromio(buf, docptr + DoC_Mil_CDSN_IO, len - 1); -#endif - buf[len - 1] = ReadDOC(docptr, LastDataRead); - - /* Let the caller know we completed it */ - *retlen = len; - ret = 0; - - /* Read the ECC data from Spare Data Area, - see Reed-Solomon EDC/ECC 11.1 */ - dummy = ReadDOC(docptr, ReadPipeInit); -#ifndef USE_MEMCPY - for (i = 0; i < 5; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - eccbuf[i] = ReadDOC(docptr, Mil_CDSN_IO + i); - } -#else - memcpy_fromio(eccbuf, docptr + DoC_Mil_CDSN_IO, 5); -#endif - eccbuf[5] = ReadDOC(docptr, LastDataRead); - - /* Flush the pipeline */ - dummy = ReadDOC(docptr, ECCConf); - dummy = ReadDOC(docptr, ECCConf); - - /* Check the ECC Status */ - if (ReadDOC(docptr, ECCConf) & 0x80) { - int nb_errors; - /* There was an ECC error */ -#ifdef ECC_DEBUG - printk("DiskOnChip ECC Error: Read at %lx\n", (long)from); -#endif - /* Read the ECC syndrome through the DiskOnChip ECC logic. - These syndrome will be all ZERO when there is no error */ - for (i = 0; i < 6; i++) { - syndrome[i] = ReadDOC(docptr, ECCSyndrome0 + i); - } - nb_errors = doc_decode_ecc(buf, syndrome); -#ifdef ECC_DEBUG - printk("ECC Errors corrected: %x\n", nb_errors); -#endif - if (nb_errors < 0) { - /* We return error, but have actually done the read. Not that - this can be told to user-space, via sys_read(), but at least - MTD-aware stuff can know about it by checking *retlen */ - ret = -EIO; - } - } - -#ifdef PSYCHO_DEBUG - printk("ECC DATA at %lx: %2.2X %2.2X %2.2X %2.2X %2.2X %2.2X\n", - (long)from, eccbuf[0], eccbuf[1], eccbuf[2], eccbuf[3], - eccbuf[4], eccbuf[5]); -#endif - - /* disable the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr , ECCConf); - - return ret; -} - -static int doc_write (struct mtd_info *mtd, loff_t to, size_t len, - size_t *retlen, const u_char *buf) -{ - int i,ret = 0; - char eccbuf[6]; - volatile char dummy; - struct DiskOnChip *this = mtd->priv; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[to >> (this->chipshift)]; - -#if 0 - /* Don't allow a single write to cross a 512-byte block boundary */ - if (to + len > ( (to | 0x1ff) + 1)) - len = ((to | 0x1ff) + 1) - to; -#else - /* Don't allow writes which aren't exactly one block */ - if (to & 0x1ff || len != 0x200) - return -EINVAL; -#endif - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0x00); - DoC_WaitReady(docptr); - /* Set device to main plane of flash */ - DoC_Command(docptr, NAND_CMD_READ0, 0x00); - - /* issue the Serial Data In command to initial the Page Program process */ - DoC_Command(docptr, NAND_CMD_SEQIN, 0x00); - DoC_Address(docptr, 3, to, 0x00, 0x00); - DoC_WaitReady(docptr); - - /* init the ECC engine, see Reed-Solomon EDC/ECC 11.1 .*/ - WriteDOC (DOC_ECC_RESET, docptr, ECCConf); - WriteDOC (DOC_ECC_EN | DOC_ECC_RW, docptr, ECCConf); - - /* Write the data via the internal pipeline through CDSN IO register, - see Pipelined Write Operations 11.2 */ -#ifndef USE_MEMCPY - for (i = 0; i < len; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - WriteDOC(buf[i], docptr, Mil_CDSN_IO + i); - } -#else - memcpy_toio(docptr + DoC_Mil_CDSN_IO, buf, len); -#endif - WriteDOC(0x00, docptr, WritePipeTerm); - - /* Write ECC data to flash, the ECC info is generated by the DiskOnChip ECC logic - see Reed-Solomon EDC/ECC 11.1 */ - WriteDOC(0, docptr, NOP); - WriteDOC(0, docptr, NOP); - WriteDOC(0, docptr, NOP); - - /* Read the ECC data through the DiskOnChip ECC logic */ - for (i = 0; i < 6; i++) { - eccbuf[i] = ReadDOC(docptr, ECCSyndrome0 + i); - } - - /* ignore the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr , ECCConf); - -#ifndef USE_MEMCPY - /* Write the ECC data to flash */ - for (i = 0; i < 6; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - WriteDOC(eccbuf[i], docptr, Mil_CDSN_IO + i); - } -#else - memcpy_toio(docptr + DoC_Mil_CDSN_IO, eccbuf, 6); -#endif - - /* write the block status BLOCK_USED (0x5555) at the end of ECC data - FIXME: this is only a hack for programming the IPL area for LinuxBIOS - and should be replace with proper codes in user space utilities */ - WriteDOC(0x55, docptr, Mil_CDSN_IO); - WriteDOC(0x55, docptr, Mil_CDSN_IO + 1); - - WriteDOC(0x00, docptr, WritePipeTerm); - -#ifdef PSYCHO_DEBUG - printk("OOB data at %lx is %2.2X %2.2X %2.2X %2.2X %2.2X %2.2X\n", - (long) to, eccbuf[0], eccbuf[1], eccbuf[2], eccbuf[3], - eccbuf[4], eccbuf[5]); -#endif - - /* Commit the Page Program command and wait for ready - see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_PAGEPROG, 0x00); - DoC_WaitReady(docptr); - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5.*/ - DoC_Command(docptr, NAND_CMD_STATUS, CDSN_CTRL_WP); - dummy = ReadDOC(docptr, ReadPipeInit); - DoC_Delay(docptr, 2); - if (ReadDOC(docptr, Mil_CDSN_IO) & 1) { - printk("Error programming flash\n"); - /* Error in programming - FIXME: implement Bad Block Replacement (in nftl.c ??) */ - ret = -EIO; - } - dummy = ReadDOC(docptr, LastDataRead); - - /* Let the caller know we completed it */ - *retlen = len; - - return ret; -} - -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ -#ifndef USE_MEMCPY - int i; -#endif - volatile char dummy; - struct DiskOnChip *this = mtd->priv; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - uint8_t *buf = ops->oobbuf; - size_t len = ops->len; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - ofs += ops->ooboffs; - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* disable the ECC engine */ - WriteDOC (DOC_ECC_RESET, docptr, ECCConf); - WriteDOC (DOC_ECC_DIS, docptr, ECCConf); - - /* issue the Read2 command to set the pointer to the Spare Data Area. - Polling the Flash Ready bit after issue 3 bytes address in - Sequence Read Mode, see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_READOOB, CDSN_CTRL_WP); - DoC_Address(docptr, 3, ofs, CDSN_CTRL_WP, 0x00); - DoC_WaitReady(docptr); - - /* Read the data out via the internal pipeline through CDSN IO register, - see Pipelined Read Operations 11.3 */ - dummy = ReadDOC(docptr, ReadPipeInit); -#ifndef USE_MEMCPY - for (i = 0; i < len-1; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - buf[i] = ReadDOC(docptr, Mil_CDSN_IO + i); - } -#else - memcpy_fromio(buf, docptr + DoC_Mil_CDSN_IO, len - 1); -#endif - buf[len - 1] = ReadDOC(docptr, LastDataRead); - - ops->retlen = len; - - return 0; -} - -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ -#ifndef USE_MEMCPY - int i; -#endif - volatile char dummy; - int ret = 0; - struct DiskOnChip *this = mtd->priv; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - uint8_t *buf = ops->oobbuf; - size_t len = ops->len; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - ofs += ops->ooboffs; - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* disable the ECC engine */ - WriteDOC (DOC_ECC_RESET, docptr, ECCConf); - WriteDOC (DOC_ECC_DIS, docptr, ECCConf); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, CDSN_CTRL_WP); - DoC_WaitReady(docptr); - /* issue the Read2 command to set the pointer to the Spare Data Area. */ - DoC_Command(docptr, NAND_CMD_READOOB, CDSN_CTRL_WP); - - /* issue the Serial Data In command to initial the Page Program process */ - DoC_Command(docptr, NAND_CMD_SEQIN, 0x00); - DoC_Address(docptr, 3, ofs, 0x00, 0x00); - - /* Write the data via the internal pipeline through CDSN IO register, - see Pipelined Write Operations 11.2 */ -#ifndef USE_MEMCPY - for (i = 0; i < len; i++) { - /* N.B. you have to increase the source address in this way or the - ECC logic will not work properly */ - WriteDOC(buf[i], docptr, Mil_CDSN_IO + i); - } -#else - memcpy_toio(docptr + DoC_Mil_CDSN_IO, buf, len); -#endif - WriteDOC(0x00, docptr, WritePipeTerm); - - /* Commit the Page Program command and wait for ready - see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_PAGEPROG, 0x00); - DoC_WaitReady(docptr); - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5.*/ - DoC_Command(docptr, NAND_CMD_STATUS, 0x00); - dummy = ReadDOC(docptr, ReadPipeInit); - DoC_Delay(docptr, 2); - if (ReadDOC(docptr, Mil_CDSN_IO) & 1) { - printk("Error programming oob data\n"); - /* FIXME: implement Bad Block Replacement (in nftl.c ??) */ - ops->retlen = 0; - ret = -EIO; - } - dummy = ReadDOC(docptr, LastDataRead); - - ops->retlen = len; - - return ret; -} - -int doc_erase (struct mtd_info *mtd, struct erase_info *instr) -{ - volatile char dummy; - struct DiskOnChip *this = mtd->priv; - __u32 ofs = instr->addr; - __u32 len = instr->len; - void __iomem *docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - - if (len != mtd->erasesize) - printk(KERN_WARNING "Erase not right size (%x != %x)n", - len, mtd->erasesize); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - instr->state = MTD_ERASE_PENDING; - - /* issue the Erase Setup command */ - DoC_Command(docptr, NAND_CMD_ERASE1, 0x00); - DoC_Address(docptr, 2, ofs, 0x00, 0x00); - - /* Commit the Erase Start command and wait for ready - see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_ERASE2, 0x00); - DoC_WaitReady(docptr); - - instr->state = MTD_ERASING; - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5. - FIXME: it seems that we are not wait long enough, some blocks are not - erased fully */ - DoC_Command(docptr, NAND_CMD_STATUS, CDSN_CTRL_WP); - dummy = ReadDOC(docptr, ReadPipeInit); - DoC_Delay(docptr, 2); - if (ReadDOC(docptr, Mil_CDSN_IO) & 1) { - printk("Error Erasing at 0x%x\n", ofs); - /* There was an error - FIXME: implement Bad Block Replacement (in nftl.c ??) */ - instr->state = MTD_ERASE_FAILED; - } else - instr->state = MTD_ERASE_DONE; - dummy = ReadDOC(docptr, LastDataRead); - - mtd_erase_callback(instr); - - return 0; -} - -/**************************************************************************** - * - * Module stuff - * - ****************************************************************************/ - -static void __exit cleanup_doc2001(void) -{ - struct mtd_info *mtd; - struct DiskOnChip *this; - - while ((mtd=docmillist)) { - this = mtd->priv; - docmillist = this->nextdoc; - - mtd_device_unregister(mtd); - - iounmap(this->virtadr); - kfree(this->chips); - kfree(mtd); - } -} - -module_exit(cleanup_doc2001); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("David Woodhouse et al."); -MODULE_DESCRIPTION("Alternative driver for DiskOnChip Millennium"); diff --git a/drivers/mtd/devices/doc2001plus.c b/drivers/mtd/devices/doc2001plus.c deleted file mode 100644 index 4f2220ad8924..000000000000 --- a/drivers/mtd/devices/doc2001plus.c +++ /dev/null @@ -1,1080 +0,0 @@ -/* - * Linux driver for Disk-On-Chip Millennium Plus - * - * (c) 2002-2003 Greg Ungerer - * (c) 2002-2003 SnapGear Inc - * (c) 1999 Machine Vision Holdings, Inc. - * (c) 1999, 2000 David Woodhouse - * - * Released under GPL - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include - -/* #define ECC_DEBUG */ - -/* I have no idea why some DoC chips can not use memcop_form|to_io(). - * This may be due to the different revisions of the ASIC controller built-in or - * simplily a QA/Bug issue. Who knows ?? If you have trouble, please uncomment - * this:*/ -#undef USE_MEMCPY - -static int doc_read(struct mtd_info *mtd, loff_t from, size_t len, - size_t *retlen, u_char *buf); -static int doc_write(struct mtd_info *mtd, loff_t to, size_t len, - size_t *retlen, const u_char *buf); -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops); -static int doc_erase (struct mtd_info *mtd, struct erase_info *instr); - -static struct mtd_info *docmilpluslist = NULL; - - -/* Perform the required delay cycles by writing to the NOP register */ -static void DoC_Delay(void __iomem * docptr, int cycles) -{ - int i; - - for (i = 0; (i < cycles); i++) - WriteDOC(0, docptr, Mplus_NOP); -} - -#define CDSN_CTRL_FR_B_MASK (CDSN_CTRL_FR_B0 | CDSN_CTRL_FR_B1) - -/* DOC_WaitReady: Wait for RDY line to be asserted by the flash chip */ -static int _DoC_WaitReady(void __iomem * docptr) -{ - unsigned int c = 0xffff; - - pr_debug("_DoC_WaitReady called for out-of-line wait\n"); - - /* Out-of-line routine to wait for chip response */ - while (((ReadDOC(docptr, Mplus_FlashControl) & CDSN_CTRL_FR_B_MASK) != CDSN_CTRL_FR_B_MASK) && --c) - ; - - if (c == 0) - pr_debug("_DoC_WaitReady timed out.\n"); - - return (c == 0); -} - -static inline int DoC_WaitReady(void __iomem * docptr) -{ - /* This is inline, to optimise the common case, where it's ready instantly */ - int ret = 0; - - /* read form NOP register should be issued prior to the read from CDSNControl - see Software Requirement 11.4 item 2. */ - DoC_Delay(docptr, 4); - - if ((ReadDOC(docptr, Mplus_FlashControl) & CDSN_CTRL_FR_B_MASK) != CDSN_CTRL_FR_B_MASK) - /* Call the out-of-line routine to wait */ - ret = _DoC_WaitReady(docptr); - - return ret; -} - -/* For some reason the Millennium Plus seems to occasionally put itself - * into reset mode. For me this happens randomly, with no pattern that I - * can detect. M-systems suggest always check this on any block level - * operation and setting to normal mode if in reset mode. - */ -static inline void DoC_CheckASIC(void __iomem * docptr) -{ - /* Make sure the DoC is in normal mode */ - if ((ReadDOC(docptr, Mplus_DOCControl) & DOC_MODE_NORMAL) == 0) { - WriteDOC((DOC_MODE_NORMAL | DOC_MODE_MDWREN), docptr, Mplus_DOCControl); - WriteDOC(~(DOC_MODE_NORMAL | DOC_MODE_MDWREN), docptr, Mplus_CtrlConfirm); - } -} - -/* DoC_Command: Send a flash command to the flash chip through the Flash - * command register. Need 2 Write Pipeline Terminates to complete send. - */ -static void DoC_Command(void __iomem * docptr, unsigned char command, - unsigned char xtraflags) -{ - WriteDOC(command, docptr, Mplus_FlashCmd); - WriteDOC(command, docptr, Mplus_WritePipeTerm); - WriteDOC(command, docptr, Mplus_WritePipeTerm); -} - -/* DoC_Address: Set the current address for the flash chip through the Flash - * Address register. Need 2 Write Pipeline Terminates to complete send. - */ -static inline void DoC_Address(struct DiskOnChip *doc, int numbytes, - unsigned long ofs, unsigned char xtraflags1, - unsigned char xtraflags2) -{ - void __iomem * docptr = doc->virtadr; - - /* Allow for possible Mill Plus internal flash interleaving */ - ofs >>= doc->interleave; - - switch (numbytes) { - case 1: - /* Send single byte, bits 0-7. */ - WriteDOC(ofs & 0xff, docptr, Mplus_FlashAddress); - break; - case 2: - /* Send bits 9-16 followed by 17-23 */ - WriteDOC((ofs >> 9) & 0xff, docptr, Mplus_FlashAddress); - WriteDOC((ofs >> 17) & 0xff, docptr, Mplus_FlashAddress); - break; - case 3: - /* Send 0-7, 9-16, then 17-23 */ - WriteDOC(ofs & 0xff, docptr, Mplus_FlashAddress); - WriteDOC((ofs >> 9) & 0xff, docptr, Mplus_FlashAddress); - WriteDOC((ofs >> 17) & 0xff, docptr, Mplus_FlashAddress); - break; - default: - return; - } - - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); -} - -/* DoC_SelectChip: Select a given flash chip within the current floor */ -static int DoC_SelectChip(void __iomem * docptr, int chip) -{ - /* No choice for flash chip on Millennium Plus */ - return 0; -} - -/* DoC_SelectFloor: Select a given floor (bank of flash chips) */ -static int DoC_SelectFloor(void __iomem * docptr, int floor) -{ - WriteDOC((floor & 0x3), docptr, Mplus_DeviceSelect); - return 0; -} - -/* - * Translate the given offset into the appropriate command and offset. - * This does the mapping using the 16bit interleave layout defined by - * M-Systems, and looks like this for a sector pair: - * +-----------+-------+-------+-------+--------------+---------+-----------+ - * | 0 --- 511 |512-517|518-519|520-521| 522 --- 1033 |1034-1039|1040 - 1055| - * +-----------+-------+-------+-------+--------------+---------+-----------+ - * | Data 0 | ECC 0 |Flags0 |Flags1 | Data 1 |ECC 1 | OOB 1 + 2 | - * +-----------+-------+-------+-------+--------------+---------+-----------+ - */ -/* FIXME: This lives in INFTL not here. Other users of flash devices - may not want it */ -static unsigned int DoC_GetDataOffset(struct mtd_info *mtd, loff_t *from) -{ - struct DiskOnChip *this = mtd->priv; - - if (this->interleave) { - unsigned int ofs = *from & 0x3ff; - unsigned int cmd; - - if (ofs < 512) { - cmd = NAND_CMD_READ0; - ofs &= 0x1ff; - } else if (ofs < 1014) { - cmd = NAND_CMD_READ1; - ofs = (ofs & 0x1ff) + 10; - } else { - cmd = NAND_CMD_READOOB; - ofs = ofs - 1014; - } - - *from = (*from & ~0x3ff) | ofs; - return cmd; - } else { - /* No interleave */ - if ((*from) & 0x100) - return NAND_CMD_READ1; - return NAND_CMD_READ0; - } -} - -static unsigned int DoC_GetECCOffset(struct mtd_info *mtd, loff_t *from) -{ - unsigned int ofs, cmd; - - if (*from & 0x200) { - cmd = NAND_CMD_READOOB; - ofs = 10 + (*from & 0xf); - } else { - cmd = NAND_CMD_READ1; - ofs = (*from & 0xf); - } - - *from = (*from & ~0x3ff) | ofs; - return cmd; -} - -static unsigned int DoC_GetFlagsOffset(struct mtd_info *mtd, loff_t *from) -{ - unsigned int ofs, cmd; - - cmd = NAND_CMD_READ1; - ofs = (*from & 0x200) ? 8 : 6; - *from = (*from & ~0x3ff) | ofs; - return cmd; -} - -static unsigned int DoC_GetHdrOffset(struct mtd_info *mtd, loff_t *from) -{ - unsigned int ofs, cmd; - - cmd = NAND_CMD_READOOB; - ofs = (*from & 0x200) ? 24 : 16; - *from = (*from & ~0x3ff) | ofs; - return cmd; -} - -static inline void MemReadDOC(void __iomem * docptr, unsigned char *buf, int len) -{ -#ifndef USE_MEMCPY - int i; - for (i = 0; i < len; i++) - buf[i] = ReadDOC(docptr, Mil_CDSN_IO + i); -#else - memcpy_fromio(buf, docptr + DoC_Mil_CDSN_IO, len); -#endif -} - -static inline void MemWriteDOC(void __iomem * docptr, unsigned char *buf, int len) -{ -#ifndef USE_MEMCPY - int i; - for (i = 0; i < len; i++) - WriteDOC(buf[i], docptr, Mil_CDSN_IO + i); -#else - memcpy_toio(docptr + DoC_Mil_CDSN_IO, buf, len); -#endif -} - -/* DoC_IdentChip: Identify a given NAND chip given {floor,chip} */ -static int DoC_IdentChip(struct DiskOnChip *doc, int floor, int chip) -{ - int mfr, id, i, j; - volatile char dummy; - void __iomem * docptr = doc->virtadr; - - /* Page in the required floor/chip */ - DoC_SelectFloor(docptr, floor); - DoC_SelectChip(docptr, chip); - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC((DOC_FLASH_CE | DOC_FLASH_WP), docptr, Mplus_FlashSelect); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0); - DoC_WaitReady(docptr); - - /* Read the NAND chip ID: 1. Send ReadID command */ - DoC_Command(docptr, NAND_CMD_READID, 0); - - /* Read the NAND chip ID: 2. Send address byte zero */ - DoC_Address(doc, 1, 0x00, 0, 0x00); - - WriteDOC(0, docptr, Mplus_FlashControl); - DoC_WaitReady(docptr); - - /* Read the manufacturer and device id codes of the flash device through - CDSN IO register see Software Requirement 11.4 item 5.*/ - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - - mfr = ReadDOC(docptr, Mil_CDSN_IO); - if (doc->interleave) - dummy = ReadDOC(docptr, Mil_CDSN_IO); /* 2 way interleave */ - - id = ReadDOC(docptr, Mil_CDSN_IO); - if (doc->interleave) - dummy = ReadDOC(docptr, Mil_CDSN_IO); /* 2 way interleave */ - - dummy = ReadDOC(docptr, Mplus_LastDataRead); - dummy = ReadDOC(docptr, Mplus_LastDataRead); - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - /* No response - return failure */ - if (mfr == 0xff || mfr == 0) - return 0; - - for (i = 0; nand_flash_ids[i].name != NULL; i++) { - if (id == nand_flash_ids[i].id) { - /* Try to identify manufacturer */ - for (j = 0; nand_manuf_ids[j].id != 0x0; j++) { - if (nand_manuf_ids[j].id == mfr) - break; - } - printk(KERN_INFO "Flash chip found: Manufacturer ID: %2.2X, " - "Chip ID: %2.2X (%s:%s)\n", mfr, id, - nand_manuf_ids[j].name, nand_flash_ids[i].name); - doc->mfr = mfr; - doc->id = id; - doc->chipshift = ffs((nand_flash_ids[i].chipsize << 20)) - 1; - doc->erasesize = nand_flash_ids[i].erasesize << doc->interleave; - break; - } - } - - if (nand_flash_ids[i].name == NULL) - return 0; - return 1; -} - -/* DoC_ScanChips: Find all NAND chips present in a DiskOnChip, and identify them */ -static void DoC_ScanChips(struct DiskOnChip *this) -{ - int floor, chip; - int numchips[MAX_FLOORS_MPLUS]; - int ret; - - this->numchips = 0; - this->mfr = 0; - this->id = 0; - - /* Work out the intended interleave setting */ - this->interleave = 0; - if (this->ChipID == DOC_ChipID_DocMilPlus32) - this->interleave = 1; - - /* Check the ASIC agrees */ - if ( (this->interleave << 2) != - (ReadDOC(this->virtadr, Mplus_Configuration) & 4)) { - u_char conf = ReadDOC(this->virtadr, Mplus_Configuration); - printk(KERN_NOTICE "Setting DiskOnChip Millennium Plus interleave to %s\n", - this->interleave?"on (16-bit)":"off (8-bit)"); - conf ^= 4; - WriteDOC(conf, this->virtadr, Mplus_Configuration); - } - - /* For each floor, find the number of valid chips it contains */ - for (floor = 0,ret = 1; floor < MAX_FLOORS_MPLUS; floor++) { - numchips[floor] = 0; - for (chip = 0; chip < MAX_CHIPS_MPLUS && ret != 0; chip++) { - ret = DoC_IdentChip(this, floor, chip); - if (ret) { - numchips[floor]++; - this->numchips++; - } - } - } - /* If there are none at all that we recognise, bail */ - if (!this->numchips) { - printk("No flash chips recognised.\n"); - return; - } - - /* Allocate an array to hold the information for each chip */ - this->chips = kmalloc(sizeof(struct Nand) * this->numchips, GFP_KERNEL); - if (!this->chips){ - printk("MTD: No memory for allocating chip info structures\n"); - return; - } - - /* Fill out the chip array with {floor, chipno} for each - * detected chip in the device. */ - for (floor = 0, ret = 0; floor < MAX_FLOORS_MPLUS; floor++) { - for (chip = 0 ; chip < numchips[floor] ; chip++) { - this->chips[ret].floor = floor; - this->chips[ret].chip = chip; - this->chips[ret].curadr = 0; - this->chips[ret].curmode = 0x50; - ret++; - } - } - - /* Calculate and print the total size of the device */ - this->totlen = this->numchips * (1 << this->chipshift); - printk(KERN_INFO "%d flash chips found. Total DiskOnChip size: %ld MiB\n", - this->numchips ,this->totlen >> 20); -} - -static int DoCMilPlus_is_alias(struct DiskOnChip *doc1, struct DiskOnChip *doc2) -{ - int tmp1, tmp2, retval; - - if (doc1->physadr == doc2->physadr) - return 1; - - /* Use the alias resolution register which was set aside for this - * purpose. If it's value is the same on both chips, they might - * be the same chip, and we write to one and check for a change in - * the other. It's unclear if this register is usuable in the - * DoC 2000 (it's in the Millennium docs), but it seems to work. */ - tmp1 = ReadDOC(doc1->virtadr, Mplus_AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, Mplus_AliasResolution); - if (tmp1 != tmp2) - return 0; - - WriteDOC((tmp1+1) % 0xff, doc1->virtadr, Mplus_AliasResolution); - tmp2 = ReadDOC(doc2->virtadr, Mplus_AliasResolution); - if (tmp2 == (tmp1+1) % 0xff) - retval = 1; - else - retval = 0; - - /* Restore register contents. May not be necessary, but do it just to - * be safe. */ - WriteDOC(tmp1, doc1->virtadr, Mplus_AliasResolution); - - return retval; -} - -/* This routine is found from the docprobe code by symbol_get(), - * which will bump the use count of this module. */ -void DoCMilPlus_init(struct mtd_info *mtd) -{ - struct DiskOnChip *this = mtd->priv; - struct DiskOnChip *old = NULL; - - /* We must avoid being called twice for the same device. */ - if (docmilpluslist) - old = docmilpluslist->priv; - - while (old) { - if (DoCMilPlus_is_alias(this, old)) { - printk(KERN_NOTICE "Ignoring DiskOnChip Millennium " - "Plus at 0x%lX - already configured\n", - this->physadr); - iounmap(this->virtadr); - kfree(mtd); - return; - } - if (old->nextdoc) - old = old->nextdoc->priv; - else - old = NULL; - } - - mtd->name = "DiskOnChip Millennium Plus"; - printk(KERN_NOTICE "DiskOnChip Millennium Plus found at " - "address 0x%lX\n", this->physadr); - - mtd->type = MTD_NANDFLASH; - mtd->flags = MTD_CAP_NANDFLASH; - mtd->writebufsize = mtd->writesize = 512; - mtd->oobsize = 16; - mtd->ecc_strength = 2; - mtd->owner = THIS_MODULE; - mtd->_erase = doc_erase; - mtd->_read = doc_read; - mtd->_write = doc_write; - mtd->_read_oob = doc_read_oob; - mtd->_write_oob = doc_write_oob; - this->curfloor = -1; - this->curchip = -1; - - /* Ident all the chips present. */ - DoC_ScanChips(this); - - if (!this->totlen) { - kfree(mtd); - iounmap(this->virtadr); - } else { - this->nextdoc = docmilpluslist; - docmilpluslist = mtd; - mtd->size = this->totlen; - mtd->erasesize = this->erasesize; - mtd_device_register(mtd, NULL, 0); - return; - } -} -EXPORT_SYMBOL_GPL(DoCMilPlus_init); - -#if 0 -static int doc_dumpblk(struct mtd_info *mtd, loff_t from) -{ - int i; - loff_t fofs; - struct DiskOnChip *this = mtd->priv; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[from >> (this->chipshift)]; - unsigned char *bp, buf[1056]; - char c[32]; - - from &= ~0x3ff; - - /* Don't allow read past end of device */ - if (from >= this->totlen) - return -EINVAL; - - DoC_CheckASIC(docptr); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC((DOC_FLASH_CE | DOC_FLASH_WP), docptr, Mplus_FlashSelect); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0); - DoC_WaitReady(docptr); - - fofs = from; - DoC_Command(docptr, DoC_GetDataOffset(mtd, &fofs), 0); - DoC_Address(this, 3, fofs, 0, 0x00); - WriteDOC(0, docptr, Mplus_FlashControl); - DoC_WaitReady(docptr); - - /* disable the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, Mplus_ECCConf); - - ReadDOC(docptr, Mplus_ReadPipeInit); - ReadDOC(docptr, Mplus_ReadPipeInit); - - /* Read the data via the internal pipeline through CDSN IO - register, see Pipelined Read Operations 11.3 */ - MemReadDOC(docptr, buf, 1054); - buf[1054] = ReadDOC(docptr, Mplus_LastDataRead); - buf[1055] = ReadDOC(docptr, Mplus_LastDataRead); - - memset(&c[0], 0, sizeof(c)); - printk("DUMP OFFSET=%x:\n", (int)from); - - for (i = 0, bp = &buf[0]; (i < 1056); i++) { - if ((i % 16) == 0) - printk("%08x: ", i); - printk(" %02x", *bp); - c[(i & 0xf)] = ((*bp >= 0x20) && (*bp <= 0x7f)) ? *bp : '.'; - bp++; - if (((i + 1) % 16) == 0) - printk(" %s\n", c); - } - printk("\n"); - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - return 0; -} -#endif - -static int doc_read(struct mtd_info *mtd, loff_t from, size_t len, - size_t *retlen, u_char *buf) -{ - int ret, i; - volatile char dummy; - loff_t fofs; - unsigned char syndrome[6], eccbuf[6]; - struct DiskOnChip *this = mtd->priv; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[from >> (this->chipshift)]; - - /* Don't allow a single read to cross a 512-byte block boundary */ - if (from + len > ((from | 0x1ff) + 1)) - len = ((from | 0x1ff) + 1) - from; - - DoC_CheckASIC(docptr); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC((DOC_FLASH_CE | DOC_FLASH_WP), docptr, Mplus_FlashSelect); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0); - DoC_WaitReady(docptr); - - fofs = from; - DoC_Command(docptr, DoC_GetDataOffset(mtd, &fofs), 0); - DoC_Address(this, 3, fofs, 0, 0x00); - WriteDOC(0, docptr, Mplus_FlashControl); - DoC_WaitReady(docptr); - - /* init the ECC engine, see Reed-Solomon EDC/ECC 11.1 .*/ - WriteDOC(DOC_ECC_RESET, docptr, Mplus_ECCConf); - WriteDOC(DOC_ECC_EN, docptr, Mplus_ECCConf); - - /* Let the caller know we completed it */ - *retlen = len; - ret = 0; - - ReadDOC(docptr, Mplus_ReadPipeInit); - ReadDOC(docptr, Mplus_ReadPipeInit); - - /* Read the data via the internal pipeline through CDSN IO - register, see Pipelined Read Operations 11.3 */ - MemReadDOC(docptr, buf, len); - - /* Read the ECC data following raw data */ - MemReadDOC(docptr, eccbuf, 4); - eccbuf[4] = ReadDOC(docptr, Mplus_LastDataRead); - eccbuf[5] = ReadDOC(docptr, Mplus_LastDataRead); - - /* Flush the pipeline */ - dummy = ReadDOC(docptr, Mplus_ECCConf); - dummy = ReadDOC(docptr, Mplus_ECCConf); - - /* Check the ECC Status */ - if (ReadDOC(docptr, Mplus_ECCConf) & 0x80) { - int nb_errors; - /* There was an ECC error */ -#ifdef ECC_DEBUG - printk("DiskOnChip ECC Error: Read at %lx\n", (long)from); -#endif - /* Read the ECC syndrome through the DiskOnChip ECC logic. - These syndrome will be all ZERO when there is no error */ - for (i = 0; i < 6; i++) - syndrome[i] = ReadDOC(docptr, Mplus_ECCSyndrome0 + i); - - nb_errors = doc_decode_ecc(buf, syndrome); -#ifdef ECC_DEBUG - printk("ECC Errors corrected: %x\n", nb_errors); -#endif - if (nb_errors < 0) { - /* We return error, but have actually done the - read. Not that this can be told to user-space, via - sys_read(), but at least MTD-aware stuff can know - about it by checking *retlen */ -#ifdef ECC_DEBUG - printk("%s(%d): Millennium Plus ECC error (from=0x%x:\n", - __FILE__, __LINE__, (int)from); - printk(" syndrome= %*phC\n", 6, syndrome); - printk(" eccbuf= %*phC\n", 6, eccbuf); -#endif - ret = -EIO; - } - } - -#ifdef PSYCHO_DEBUG - printk("ECC DATA at %lx: %*ph\n", (long)from, 6, eccbuf); -#endif - /* disable the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr , Mplus_ECCConf); - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - return ret; -} - -static int doc_write(struct mtd_info *mtd, loff_t to, size_t len, - size_t *retlen, const u_char *buf) -{ - int i, before, ret = 0; - loff_t fto; - volatile char dummy; - char eccbuf[6]; - struct DiskOnChip *this = mtd->priv; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[to >> (this->chipshift)]; - - /* Don't allow writes which aren't exactly one block (512 bytes) */ - if ((to & 0x1ff) || (len != 0x200)) - return -EINVAL; - - /* Determine position of OOB flags, before or after data */ - before = (this->interleave && (to & 0x200)); - - DoC_CheckASIC(docptr); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC(DOC_FLASH_CE, docptr, Mplus_FlashSelect); - - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0); - DoC_WaitReady(docptr); - - /* Set device to appropriate plane of flash */ - fto = to; - WriteDOC(DoC_GetDataOffset(mtd, &fto), docptr, Mplus_FlashCmd); - - /* On interleaved devices the flags for 2nd half 512 are before data */ - if (before) - fto -= 2; - - /* issue the Serial Data In command to initial the Page Program process */ - DoC_Command(docptr, NAND_CMD_SEQIN, 0x00); - DoC_Address(this, 3, fto, 0x00, 0x00); - - /* Disable the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, Mplus_ECCConf); - - if (before) { - /* Write the block status BLOCK_USED (0x5555) */ - WriteDOC(0x55, docptr, Mil_CDSN_IO); - WriteDOC(0x55, docptr, Mil_CDSN_IO); - } - - /* init the ECC engine, see Reed-Solomon EDC/ECC 11.1 .*/ - WriteDOC(DOC_ECC_EN | DOC_ECC_RW, docptr, Mplus_ECCConf); - - MemWriteDOC(docptr, (unsigned char *) buf, len); - - /* Write ECC data to flash, the ECC info is generated by - the DiskOnChip ECC logic see Reed-Solomon EDC/ECC 11.1 */ - DoC_Delay(docptr, 3); - - /* Read the ECC data through the DiskOnChip ECC logic */ - for (i = 0; i < 6; i++) - eccbuf[i] = ReadDOC(docptr, Mplus_ECCSyndrome0 + i); - - /* disable the ECC engine */ - WriteDOC(DOC_ECC_DIS, docptr, Mplus_ECCConf); - - /* Write the ECC data to flash */ - MemWriteDOC(docptr, eccbuf, 6); - - if (!before) { - /* Write the block status BLOCK_USED (0x5555) */ - WriteDOC(0x55, docptr, Mil_CDSN_IO+6); - WriteDOC(0x55, docptr, Mil_CDSN_IO+7); - } - -#ifdef PSYCHO_DEBUG - printk("OOB data at %lx is %2.2X %2.2X %2.2X %2.2X %2.2X %2.2X\n", - (long) to, eccbuf[0], eccbuf[1], eccbuf[2], eccbuf[3], - eccbuf[4], eccbuf[5]); -#endif - - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); - - /* Commit the Page Program command and wait for ready - see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_PAGEPROG, 0x00); - DoC_WaitReady(docptr); - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5.*/ - DoC_Command(docptr, NAND_CMD_STATUS, 0); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - DoC_Delay(docptr, 2); - if ((dummy = ReadDOC(docptr, Mplus_LastDataRead)) & 1) { - printk("MTD: Error 0x%x programming at 0x%x\n", dummy, (int)to); - /* Error in programming - FIXME: implement Bad Block Replacement (in nftl.c ??) */ - ret = -EIO; - } - dummy = ReadDOC(docptr, Mplus_LastDataRead); - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - /* Let the caller know we completed it */ - *retlen = len; - - return ret; -} - -static int doc_read_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ - loff_t fofs, base; - struct DiskOnChip *this = mtd->priv; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - size_t i, size, got, want; - uint8_t *buf = ops->oobbuf; - size_t len = ops->len; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - ofs += ops->ooboffs; - - DoC_CheckASIC(docptr); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC((DOC_FLASH_CE | DOC_FLASH_WP), docptr, Mplus_FlashSelect); - - /* disable the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, Mplus_ECCConf); - DoC_WaitReady(docptr); - - /* Maximum of 16 bytes in the OOB region, so limit read to that */ - if (len > 16) - len = 16; - got = 0; - want = len; - - for (i = 0; ((i < 3) && (want > 0)); i++) { - /* Figure out which region we are accessing... */ - fofs = ofs; - base = ofs & 0xf; - if (!this->interleave) { - DoC_Command(docptr, NAND_CMD_READOOB, 0); - size = 16 - base; - } else if (base < 6) { - DoC_Command(docptr, DoC_GetECCOffset(mtd, &fofs), 0); - size = 6 - base; - } else if (base < 8) { - DoC_Command(docptr, DoC_GetFlagsOffset(mtd, &fofs), 0); - size = 8 - base; - } else { - DoC_Command(docptr, DoC_GetHdrOffset(mtd, &fofs), 0); - size = 16 - base; - } - if (size > want) - size = want; - - /* Issue read command */ - DoC_Address(this, 3, fofs, 0, 0x00); - WriteDOC(0, docptr, Mplus_FlashControl); - DoC_WaitReady(docptr); - - ReadDOC(docptr, Mplus_ReadPipeInit); - ReadDOC(docptr, Mplus_ReadPipeInit); - MemReadDOC(docptr, &buf[got], size - 2); - buf[got + size - 2] = ReadDOC(docptr, Mplus_LastDataRead); - buf[got + size - 1] = ReadDOC(docptr, Mplus_LastDataRead); - - ofs += size; - got += size; - want -= size; - } - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - ops->retlen = len; - return 0; -} - -static int doc_write_oob(struct mtd_info *mtd, loff_t ofs, - struct mtd_oob_ops *ops) -{ - volatile char dummy; - loff_t fofs, base; - struct DiskOnChip *this = mtd->priv; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - size_t i, size, got, want; - int ret = 0; - uint8_t *buf = ops->oobbuf; - size_t len = ops->len; - - BUG_ON(ops->mode != MTD_OPS_PLACE_OOB); - - ofs += ops->ooboffs; - - DoC_CheckASIC(docptr); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC(DOC_FLASH_CE, docptr, Mplus_FlashSelect); - - - /* Maximum of 16 bytes in the OOB region, so limit write to that */ - if (len > 16) - len = 16; - got = 0; - want = len; - - for (i = 0; ((i < 3) && (want > 0)); i++) { - /* Reset the chip, see Software Requirement 11.4 item 1. */ - DoC_Command(docptr, NAND_CMD_RESET, 0); - DoC_WaitReady(docptr); - - /* Figure out which region we are accessing... */ - fofs = ofs; - base = ofs & 0x0f; - if (!this->interleave) { - WriteDOC(NAND_CMD_READOOB, docptr, Mplus_FlashCmd); - size = 16 - base; - } else if (base < 6) { - WriteDOC(DoC_GetECCOffset(mtd, &fofs), docptr, Mplus_FlashCmd); - size = 6 - base; - } else if (base < 8) { - WriteDOC(DoC_GetFlagsOffset(mtd, &fofs), docptr, Mplus_FlashCmd); - size = 8 - base; - } else { - WriteDOC(DoC_GetHdrOffset(mtd, &fofs), docptr, Mplus_FlashCmd); - size = 16 - base; - } - if (size > want) - size = want; - - /* Issue the Serial Data In command to initial the Page Program process */ - DoC_Command(docptr, NAND_CMD_SEQIN, 0x00); - DoC_Address(this, 3, fofs, 0, 0x00); - - /* Disable the ECC engine */ - WriteDOC(DOC_ECC_RESET, docptr, Mplus_ECCConf); - - /* Write the data via the internal pipeline through CDSN IO - register, see Pipelined Write Operations 11.2 */ - MemWriteDOC(docptr, (unsigned char *) &buf[got], size); - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); - WriteDOC(0x00, docptr, Mplus_WritePipeTerm); - - /* Commit the Page Program command and wait for ready - see Software Requirement 11.4 item 1.*/ - DoC_Command(docptr, NAND_CMD_PAGEPROG, 0x00); - DoC_WaitReady(docptr); - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5.*/ - DoC_Command(docptr, NAND_CMD_STATUS, 0x00); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - DoC_Delay(docptr, 2); - if ((dummy = ReadDOC(docptr, Mplus_LastDataRead)) & 1) { - printk("MTD: Error 0x%x programming oob at 0x%x\n", - dummy, (int)ofs); - /* FIXME: implement Bad Block Replacement */ - ops->retlen = 0; - ret = -EIO; - } - dummy = ReadDOC(docptr, Mplus_LastDataRead); - - ofs += size; - got += size; - want -= size; - } - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - ops->retlen = len; - return ret; -} - -int doc_erase(struct mtd_info *mtd, struct erase_info *instr) -{ - volatile char dummy; - struct DiskOnChip *this = mtd->priv; - __u32 ofs = instr->addr; - __u32 len = instr->len; - void __iomem * docptr = this->virtadr; - struct Nand *mychip = &this->chips[ofs >> this->chipshift]; - - DoC_CheckASIC(docptr); - - if (len != mtd->erasesize) - printk(KERN_WARNING "MTD: Erase not right size (%x != %x)n", - len, mtd->erasesize); - - /* Find the chip which is to be used and select it */ - if (this->curfloor != mychip->floor) { - DoC_SelectFloor(docptr, mychip->floor); - DoC_SelectChip(docptr, mychip->chip); - } else if (this->curchip != mychip->chip) { - DoC_SelectChip(docptr, mychip->chip); - } - this->curfloor = mychip->floor; - this->curchip = mychip->chip; - - instr->state = MTD_ERASE_PENDING; - - /* Millennium Plus bus cycle sequence as per figure 2, section 2.4 */ - WriteDOC(DOC_FLASH_CE, docptr, Mplus_FlashSelect); - - DoC_Command(docptr, NAND_CMD_RESET, 0x00); - DoC_WaitReady(docptr); - - DoC_Command(docptr, NAND_CMD_ERASE1, 0); - DoC_Address(this, 2, ofs, 0, 0x00); - DoC_Command(docptr, NAND_CMD_ERASE2, 0); - DoC_WaitReady(docptr); - instr->state = MTD_ERASING; - - /* Read the status of the flash device through CDSN IO register - see Software Requirement 11.4 item 5. */ - DoC_Command(docptr, NAND_CMD_STATUS, 0); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - dummy = ReadDOC(docptr, Mplus_ReadPipeInit); - if ((dummy = ReadDOC(docptr, Mplus_LastDataRead)) & 1) { - printk("MTD: Error 0x%x erasing at 0x%x\n", dummy, ofs); - /* FIXME: implement Bad Block Replacement (in nftl.c ??) */ - instr->state = MTD_ERASE_FAILED; - } else { - instr->state = MTD_ERASE_DONE; - } - dummy = ReadDOC(docptr, Mplus_LastDataRead); - - /* Disable flash internally */ - WriteDOC(0, docptr, Mplus_FlashSelect); - - mtd_erase_callback(instr); - - return 0; -} - -/**************************************************************************** - * - * Module stuff - * - ****************************************************************************/ - -static void __exit cleanup_doc2001plus(void) -{ - struct mtd_info *mtd; - struct DiskOnChip *this; - - while ((mtd=docmilpluslist)) { - this = mtd->priv; - docmilpluslist = this->nextdoc; - - mtd_device_unregister(mtd); - - iounmap(this->virtadr); - kfree(this->chips); - kfree(mtd); - } -} - -module_exit(cleanup_doc2001plus); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Greg Ungerer et al."); -MODULE_DESCRIPTION("Driver for DiskOnChip Millennium Plus"); diff --git a/drivers/mtd/devices/docecc.c b/drivers/mtd/devices/docecc.c deleted file mode 100644 index 4a1c39b6f37d..000000000000 --- a/drivers/mtd/devices/docecc.c +++ /dev/null @@ -1,521 +0,0 @@ -/* - * ECC algorithm for M-systems disk on chip. We use the excellent Reed - * Solmon code of Phil Karn (karn@ka9q.ampr.org) available under the - * GNU GPL License. The rest is simply to convert the disk on chip - * syndrome into a standard syndome. - * - * Author: Fabrice Bellard (fabrice.bellard@netgem.com) - * Copyright (C) 2000 Netgem S.A. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include - -#define DEBUG_ECC 0 -/* need to undef it (from asm/termbits.h) */ -#undef B0 - -#define MM 10 /* Symbol size in bits */ -#define KK (1023-4) /* Number of data symbols per block */ -#define B0 510 /* First root of generator polynomial, alpha form */ -#define PRIM 1 /* power of alpha used to generate roots of generator poly */ -#define NN ((1 << MM) - 1) - -typedef unsigned short dtype; - -/* 1+x^3+x^10 */ -static const int Pp[MM+1] = { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1 }; - -/* This defines the type used to store an element of the Galois Field - * used by the code. Make sure this is something larger than a char if - * if anything larger than GF(256) is used. - * - * Note: unsigned char will work up to GF(256) but int seems to run - * faster on the Pentium. - */ -typedef int gf; - -/* No legal value in index form represents zero, so - * we need a special value for this purpose - */ -#define A0 (NN) - -/* Compute x % NN, where NN is 2**MM - 1, - * without a slow divide - */ -static inline gf -modnn(int x) -{ - while (x >= NN) { - x -= NN; - x = (x >> MM) + (x & NN); - } - return x; -} - -#define CLEAR(a,n) {\ -int ci;\ -for(ci=(n)-1;ci >=0;ci--)\ -(a)[ci] = 0;\ -} - -#define COPY(a,b,n) {\ -int ci;\ -for(ci=(n)-1;ci >=0;ci--)\ -(a)[ci] = (b)[ci];\ -} - -#define COPYDOWN(a,b,n) {\ -int ci;\ -for(ci=(n)-1;ci >=0;ci--)\ -(a)[ci] = (b)[ci];\ -} - -#define Ldec 1 - -/* generate GF(2**m) from the irreducible polynomial p(X) in Pp[0]..Pp[m] - lookup tables: index->polynomial form alpha_to[] contains j=alpha**i; - polynomial form -> index form index_of[j=alpha**i] = i - alpha=2 is the primitive element of GF(2**m) - HARI's COMMENT: (4/13/94) alpha_to[] can be used as follows: - Let @ represent the primitive element commonly called "alpha" that - is the root of the primitive polynomial p(x). Then in GF(2^m), for any - 0 <= i <= 2^m-2, - @^i = a(0) + a(1) @ + a(2) @^2 + ... + a(m-1) @^(m-1) - where the binary vector (a(0),a(1),a(2),...,a(m-1)) is the representation - of the integer "alpha_to[i]" with a(0) being the LSB and a(m-1) the MSB. Thus for - example the polynomial representation of @^5 would be given by the binary - representation of the integer "alpha_to[5]". - Similarly, index_of[] can be used as follows: - As above, let @ represent the primitive element of GF(2^m) that is - the root of the primitive polynomial p(x). In order to find the power - of @ (alpha) that has the polynomial representation - a(0) + a(1) @ + a(2) @^2 + ... + a(m-1) @^(m-1) - we consider the integer "i" whose binary representation with a(0) being LSB - and a(m-1) MSB is (a(0),a(1),...,a(m-1)) and locate the entry - "index_of[i]". Now, @^index_of[i] is that element whose polynomial - representation is (a(0),a(1),a(2),...,a(m-1)). - NOTE: - The element alpha_to[2^m-1] = 0 always signifying that the - representation of "@^infinity" = 0 is (0,0,0,...,0). - Similarly, the element index_of[0] = A0 always signifying - that the power of alpha which has the polynomial representation - (0,0,...,0) is "infinity". - -*/ - -static void -generate_gf(dtype Alpha_to[NN + 1], dtype Index_of[NN + 1]) -{ - register int i, mask; - - mask = 1; - Alpha_to[MM] = 0; - for (i = 0; i < MM; i++) { - Alpha_to[i] = mask; - Index_of[Alpha_to[i]] = i; - /* If Pp[i] == 1 then, term @^i occurs in poly-repr of @^MM */ - if (Pp[i] != 0) - Alpha_to[MM] ^= mask; /* Bit-wise EXOR operation */ - mask <<= 1; /* single left-shift */ - } - Index_of[Alpha_to[MM]] = MM; - /* - * Have obtained poly-repr of @^MM. Poly-repr of @^(i+1) is given by - * poly-repr of @^i shifted left one-bit and accounting for any @^MM - * term that may occur when poly-repr of @^i is shifted. - */ - mask >>= 1; - for (i = MM + 1; i < NN; i++) { - if (Alpha_to[i - 1] >= mask) - Alpha_to[i] = Alpha_to[MM] ^ ((Alpha_to[i - 1] ^ mask) << 1); - else - Alpha_to[i] = Alpha_to[i - 1] << 1; - Index_of[Alpha_to[i]] = i; - } - Index_of[0] = A0; - Alpha_to[NN] = 0; -} - -/* - * Performs ERRORS+ERASURES decoding of RS codes. bb[] is the content - * of the feedback shift register after having processed the data and - * the ECC. - * - * Return number of symbols corrected, or -1 if codeword is illegal - * or uncorrectable. If eras_pos is non-null, the detected error locations - * are written back. NOTE! This array must be at least NN-KK elements long. - * The corrected data are written in eras_val[]. They must be xor with the data - * to retrieve the correct data : data[erase_pos[i]] ^= erase_val[i] . - * - * First "no_eras" erasures are declared by the calling program. Then, the - * maximum # of errors correctable is t_after_eras = floor((NN-KK-no_eras)/2). - * If the number of channel errors is not greater than "t_after_eras" the - * transmitted codeword will be recovered. Details of algorithm can be found - * in R. Blahut's "Theory ... of Error-Correcting Codes". - - * Warning: the eras_pos[] array must not contain duplicate entries; decoder failure - * will result. The decoder *could* check for this condition, but it would involve - * extra time on every decoding operation. - * */ -static int -eras_dec_rs(dtype Alpha_to[NN + 1], dtype Index_of[NN + 1], - gf bb[NN - KK + 1], gf eras_val[NN-KK], int eras_pos[NN-KK], - int no_eras) -{ - int deg_lambda, el, deg_omega; - int i, j, r,k; - gf u,q,tmp,num1,num2,den,discr_r; - gf lambda[NN-KK + 1], s[NN-KK + 1]; /* Err+Eras Locator poly - * and syndrome poly */ - gf b[NN-KK + 1], t[NN-KK + 1], omega[NN-KK + 1]; - gf root[NN-KK], reg[NN-KK + 1], loc[NN-KK]; - int syn_error, count; - - syn_error = 0; - for(i=0;i 0) { - /* Init lambda to be the erasure locator polynomial */ - lambda[1] = Alpha_to[modnn(PRIM * eras_pos[0])]; - for (i = 1; i < no_eras; i++) { - u = modnn(PRIM*eras_pos[i]); - for (j = i+1; j > 0; j--) { - tmp = Index_of[lambda[j - 1]]; - if(tmp != A0) - lambda[j] ^= Alpha_to[modnn(u + tmp)]; - } - } -#if DEBUG_ECC >= 1 - /* Test code that verifies the erasure locator polynomial just constructed - Needed only for decoder debugging. */ - - /* find roots of the erasure location polynomial */ - for(i=1;i<=no_eras;i++) - reg[i] = Index_of[lambda[i]]; - count = 0; - for (i = 1,k=NN-Ldec; i <= NN; i++,k = modnn(NN+k-Ldec)) { - q = 1; - for (j = 1; j <= no_eras; j++) - if (reg[j] != A0) { - reg[j] = modnn(reg[j] + j); - q ^= Alpha_to[reg[j]]; - } - if (q != 0) - continue; - /* store root and error location number indices */ - root[count] = i; - loc[count] = k; - count++; - } - if (count != no_eras) { - printf("\n lambda(x) is WRONG\n"); - count = -1; - goto finish; - } -#if DEBUG_ECC >= 2 - printf("\n Erasure positions as determined by roots of Eras Loc Poly:\n"); - for (i = 0; i < count; i++) - printf("%d ", loc[i]); - printf("\n"); -#endif -#endif - } - for(i=0;i 0; j--){ - if (reg[j] != A0) { - reg[j] = modnn(reg[j] + j); - q ^= Alpha_to[reg[j]]; - } - } - if (q != 0) - continue; - /* store root (index-form) and error location number */ - root[count] = i; - loc[count] = k; - /* If we've already found max possible roots, - * abort the search to save time - */ - if(++count == deg_lambda) - break; - } - if (deg_lambda != count) { - /* - * deg(lambda) unequal to number of roots => uncorrectable - * error detected - */ - count = -1; - goto finish; - } - /* - * Compute err+eras evaluator poly omega(x) = s(x)*lambda(x) (modulo - * x**(NN-KK)). in index form. Also find deg(omega). - */ - deg_omega = 0; - for (i = 0; i < NN-KK;i++){ - tmp = 0; - j = (deg_lambda < i) ? deg_lambda : i; - for(;j >= 0; j--){ - if ((s[i + 1 - j] != A0) && (lambda[j] != A0)) - tmp ^= Alpha_to[modnn(s[i + 1 - j] + lambda[j])]; - } - if(tmp != 0) - deg_omega = i; - omega[i] = Index_of[tmp]; - } - omega[NN-KK] = A0; - - /* - * Compute error values in poly-form. num1 = omega(inv(X(l))), num2 = - * inv(X(l))**(B0-1) and den = lambda_pr(inv(X(l))) all in poly-form - */ - for (j = count-1; j >=0; j--) { - num1 = 0; - for (i = deg_omega; i >= 0; i--) { - if (omega[i] != A0) - num1 ^= Alpha_to[modnn(omega[i] + i * root[j])]; - } - num2 = Alpha_to[modnn(root[j] * (B0 - 1) + NN)]; - den = 0; - - /* lambda[i+1] for i even is the formal derivative lambda_pr of lambda[i] */ - for (i = min(deg_lambda,NN-KK-1) & ~1; i >= 0; i -=2) { - if(lambda[i+1] != A0) - den ^= Alpha_to[modnn(lambda[i+1] + i * root[j])]; - } - if (den == 0) { -#if DEBUG_ECC >= 1 - printf("\n ERROR: denominator = 0\n"); -#endif - /* Convert to dual- basis */ - count = -1; - goto finish; - } - /* Apply error to data */ - if (num1 != 0) { - eras_val[j] = Alpha_to[modnn(Index_of[num1] + Index_of[num2] + NN - Index_of[den])]; - } else { - eras_val[j] = 0; - } - } - finish: - for(i=0;i> 2) | ((ecc1[2] & 0x0f) << 6); - bb[2] = ((ecc1[2] & 0xf0) >> 4) | ((ecc1[3] & 0x3f) << 4); - bb[3] = ((ecc1[3] & 0xc0) >> 6) | ((ecc1[0] & 0xff) << 2); - - nb_errors = eras_dec_rs(Alpha_to, Index_of, bb, - error_val, error_pos, 0); - if (nb_errors <= 0) - goto the_end; - - /* correct the errors */ - for(i=0;i= NB_DATA && pos < KK) { - nb_errors = -1; - goto the_end; - } - if (pos < NB_DATA) { - /* extract bit position (MSB first) */ - pos = 10 * (NB_DATA - 1 - pos) - 6; - /* now correct the following 10 bits. At most two bytes - can be modified since pos is even */ - index = (pos >> 3) ^ 1; - bitpos = pos & 7; - if ((index >= 0 && index < SECTOR_SIZE) || - index == (SECTOR_SIZE + 1)) { - val = error_val[i] >> (2 + bitpos); - parity ^= val; - if (index < SECTOR_SIZE) - sector[index] ^= val; - } - index = ((pos >> 3) + 1) ^ 1; - bitpos = (bitpos + 10) & 7; - if (bitpos == 0) - bitpos = 8; - if ((index >= 0 && index < SECTOR_SIZE) || - index == (SECTOR_SIZE + 1)) { - val = error_val[i] << (8 - bitpos); - parity ^= val; - if (index < SECTOR_SIZE) - sector[index] ^= val; - } - } - } - - /* use parity to test extra errors */ - if ((parity & 0xff) != 0) - nb_errors = -1; - - the_end: - kfree(Alpha_to); - kfree(Index_of); - return nb_errors; -} - -EXPORT_SYMBOL_GPL(doc_decode_ecc); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Fabrice Bellard "); -MODULE_DESCRIPTION("ECC code for correcting errors detected by DiskOnChip 2000 and Millennium ECC hardware"); diff --git a/drivers/mtd/devices/docg3.c b/drivers/mtd/devices/docg3.c index 8510ccb9c6f0..3e1b0a0ef4db 100644 --- a/drivers/mtd/devices/docg3.c +++ b/drivers/mtd/devices/docg3.c @@ -123,7 +123,7 @@ static inline void doc_flash_address(struct docg3 *docg3, u8 addr) doc_writeb(docg3, addr, DOC_FLASHADDRESS); } -static char const *part_probes[] = { "cmdlinepart", "saftlpart", NULL }; +static char const * const part_probes[] = { "cmdlinepart", "saftlpart", NULL }; static int doc_register_readb(struct docg3 *docg3, int reg) { @@ -2144,18 +2144,7 @@ static struct platform_driver g3_driver = { .remove = __exit_p(docg3_release), }; -static int __init docg3_init(void) -{ - return platform_driver_probe(&g3_driver, docg3_probe); -} -module_init(docg3_init); - - -static void __exit docg3_exit(void) -{ - platform_driver_unregister(&g3_driver); -} -module_exit(docg3_exit); +module_platform_driver_probe(g3_driver, docg3_probe); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Robert Jarzmik "); diff --git a/drivers/mtd/devices/docprobe.c b/drivers/mtd/devices/docprobe.c deleted file mode 100644 index 88b3fd3e18a7..000000000000 --- a/drivers/mtd/devices/docprobe.c +++ /dev/null @@ -1,325 +0,0 @@ - -/* Linux driver for Disk-On-Chip devices */ -/* Probe routines common to all DoC devices */ -/* (C) 1999 Machine Vision Holdings, Inc. */ -/* (C) 1999-2003 David Woodhouse */ - - -/* DOC_PASSIVE_PROBE: - In order to ensure that the BIOS checksum is correct at boot time, and - hence that the onboard BIOS extension gets executed, the DiskOnChip - goes into reset mode when it is read sequentially: all registers - return 0xff until the chip is woken up again by writing to the - DOCControl register. - - Unfortunately, this means that the probe for the DiskOnChip is unsafe, - because one of the first things it does is write to where it thinks - the DOCControl register should be - which may well be shared memory - for another device. I've had machines which lock up when this is - attempted. Hence the possibility to do a passive probe, which will fail - to detect a chip in reset mode, but is at least guaranteed not to lock - the machine. - - If you have this problem, uncomment the following line: -#define DOC_PASSIVE_PROBE -*/ - - -/* DOC_SINGLE_DRIVER: - Millennium driver has been merged into DOC2000 driver. - - The old Millennium-only driver has been retained just in case there - are problems with the new code. If the combined driver doesn't work - for you, you can try the old one by undefining DOC_SINGLE_DRIVER - below and also enabling it in your configuration. If this fixes the - problems, please send a report to the MTD mailing list at - . -*/ -#define DOC_SINGLE_DRIVER - -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include - - -static unsigned long doc_config_location = CONFIG_MTD_DOCPROBE_ADDRESS; -module_param(doc_config_location, ulong, 0); -MODULE_PARM_DESC(doc_config_location, "Physical memory address at which to probe for DiskOnChip"); - -static unsigned long __initdata doc_locations[] = { -#if defined (__alpha__) || defined(__i386__) || defined(__x86_64__) -#ifdef CONFIG_MTD_DOCPROBE_HIGH - 0xfffc8000, 0xfffca000, 0xfffcc000, 0xfffce000, - 0xfffd0000, 0xfffd2000, 0xfffd4000, 0xfffd6000, - 0xfffd8000, 0xfffda000, 0xfffdc000, 0xfffde000, - 0xfffe0000, 0xfffe2000, 0xfffe4000, 0xfffe6000, - 0xfffe8000, 0xfffea000, 0xfffec000, 0xfffee000, -#else /* CONFIG_MTD_DOCPROBE_HIGH */ - 0xc8000, 0xca000, 0xcc000, 0xce000, - 0xd0000, 0xd2000, 0xd4000, 0xd6000, - 0xd8000, 0xda000, 0xdc000, 0xde000, - 0xe0000, 0xe2000, 0xe4000, 0xe6000, - 0xe8000, 0xea000, 0xec000, 0xee000, -#endif /* CONFIG_MTD_DOCPROBE_HIGH */ -#endif - 0xffffffff }; - -/* doccheck: Probe a given memory window to see if there's a DiskOnChip present */ - -static inline int __init doccheck(void __iomem *potential, unsigned long physadr) -{ - void __iomem *window=potential; - unsigned char tmp, tmpb, tmpc, ChipID; -#ifndef DOC_PASSIVE_PROBE - unsigned char tmp2; -#endif - - /* Routine copied from the Linux DOC driver */ - -#ifdef CONFIG_MTD_DOCPROBE_55AA - /* Check for 0x55 0xAA signature at beginning of window, - this is no longer true once we remove the IPL (for Millennium */ - if (ReadDOC(window, Sig1) != 0x55 || ReadDOC(window, Sig2) != 0xaa) - return 0; -#endif /* CONFIG_MTD_DOCPROBE_55AA */ - -#ifndef DOC_PASSIVE_PROBE - /* It's not possible to cleanly detect the DiskOnChip - the - * bootup procedure will put the device into reset mode, and - * it's not possible to talk to it without actually writing - * to the DOCControl register. So we store the current contents - * of the DOCControl register's location, in case we later decide - * that it's not a DiskOnChip, and want to put it back how we - * found it. - */ - tmp2 = ReadDOC(window, DOCControl); - - /* Reset the DiskOnChip ASIC */ - WriteDOC(DOC_MODE_CLR_ERR | DOC_MODE_MDWREN | DOC_MODE_RESET, - window, DOCControl); - WriteDOC(DOC_MODE_CLR_ERR | DOC_MODE_MDWREN | DOC_MODE_RESET, - window, DOCControl); - - /* Enable the DiskOnChip ASIC */ - WriteDOC(DOC_MODE_CLR_ERR | DOC_MODE_MDWREN | DOC_MODE_NORMAL, - window, DOCControl); - WriteDOC(DOC_MODE_CLR_ERR | DOC_MODE_MDWREN | DOC_MODE_NORMAL, - window, DOCControl); -#endif /* !DOC_PASSIVE_PROBE */ - - /* We need to read the ChipID register four times. For some - newer DiskOnChip 2000 units, the first three reads will - return the DiskOnChip Millennium ident. Don't ask. */ - ChipID = ReadDOC(window, ChipID); - - switch (ChipID) { - case DOC_ChipID_Doc2k: - /* Check the TOGGLE bit in the ECC register */ - tmp = ReadDOC(window, 2k_ECCStatus) & DOC_TOGGLE_BIT; - tmpb = ReadDOC(window, 2k_ECCStatus) & DOC_TOGGLE_BIT; - tmpc = ReadDOC(window, 2k_ECCStatus) & DOC_TOGGLE_BIT; - if (tmp != tmpb && tmp == tmpc) - return ChipID; - break; - - case DOC_ChipID_DocMil: - /* Check for the new 2000 with Millennium ASIC */ - ReadDOC(window, ChipID); - ReadDOC(window, ChipID); - if (ReadDOC(window, ChipID) != DOC_ChipID_DocMil) - ChipID = DOC_ChipID_Doc2kTSOP; - - /* Check the TOGGLE bit in the ECC register */ - tmp = ReadDOC(window, ECCConf) & DOC_TOGGLE_BIT; - tmpb = ReadDOC(window, ECCConf) & DOC_TOGGLE_BIT; - tmpc = ReadDOC(window, ECCConf) & DOC_TOGGLE_BIT; - if (tmp != tmpb && tmp == tmpc) - return ChipID; - break; - - case DOC_ChipID_DocMilPlus16: - case DOC_ChipID_DocMilPlus32: - case 0: - /* Possible Millennium+, need to do more checks */ -#ifndef DOC_PASSIVE_PROBE - /* Possibly release from power down mode */ - for (tmp = 0; (tmp < 4); tmp++) - ReadDOC(window, Mplus_Power); - - /* Reset the DiskOnChip ASIC */ - tmp = DOC_MODE_RESET | DOC_MODE_MDWREN | DOC_MODE_RST_LAT | - DOC_MODE_BDECT; - WriteDOC(tmp, window, Mplus_DOCControl); - WriteDOC(~tmp, window, Mplus_CtrlConfirm); - - mdelay(1); - /* Enable the DiskOnChip ASIC */ - tmp = DOC_MODE_NORMAL | DOC_MODE_MDWREN | DOC_MODE_RST_LAT | - DOC_MODE_BDECT; - WriteDOC(tmp, window, Mplus_DOCControl); - WriteDOC(~tmp, window, Mplus_CtrlConfirm); - mdelay(1); -#endif /* !DOC_PASSIVE_PROBE */ - - ChipID = ReadDOC(window, ChipID); - - switch (ChipID) { - case DOC_ChipID_DocMilPlus16: - case DOC_ChipID_DocMilPlus32: - /* Check the TOGGLE bit in the toggle register */ - tmp = ReadDOC(window, Mplus_Toggle) & DOC_TOGGLE_BIT; - tmpb = ReadDOC(window, Mplus_Toggle) & DOC_TOGGLE_BIT; - tmpc = ReadDOC(window, Mplus_Toggle) & DOC_TOGGLE_BIT; - if (tmp != tmpb && tmp == tmpc) - return ChipID; - default: - break; - } - /* FALL TRHU */ - - default: - -#ifdef CONFIG_MTD_DOCPROBE_55AA - printk(KERN_DEBUG "Possible DiskOnChip with unknown ChipID %2.2X found at 0x%lx\n", - ChipID, physadr); -#endif -#ifndef DOC_PASSIVE_PROBE - /* Put back the contents of the DOCControl register, in case it's not - * actually a DiskOnChip. - */ - WriteDOC(tmp2, window, DOCControl); -#endif - return 0; - } - - printk(KERN_WARNING "DiskOnChip failed TOGGLE test, dropping.\n"); - -#ifndef DOC_PASSIVE_PROBE - /* Put back the contents of the DOCControl register: it's not a DiskOnChip */ - WriteDOC(tmp2, window, DOCControl); -#endif - return 0; -} - -static int docfound; - -extern void DoC2k_init(struct mtd_info *); -extern void DoCMil_init(struct mtd_info *); -extern void DoCMilPlus_init(struct mtd_info *); - -static void __init DoC_Probe(unsigned long physadr) -{ - void __iomem *docptr; - struct DiskOnChip *this; - struct mtd_info *mtd; - int ChipID; - char namebuf[15]; - char *name = namebuf; - void (*initroutine)(struct mtd_info *) = NULL; - - docptr = ioremap(physadr, DOC_IOREMAP_LEN); - - if (!docptr) - return; - - if ((ChipID = doccheck(docptr, physadr))) { - if (ChipID == DOC_ChipID_Doc2kTSOP) { - /* Remove this at your own peril. The hardware driver works but nothing prevents you from erasing bad blocks */ - printk(KERN_NOTICE "Refusing to drive DiskOnChip 2000 TSOP until Bad Block Table is correctly supported by INFTL\n"); - iounmap(docptr); - return; - } - docfound = 1; - mtd = kzalloc(sizeof(struct DiskOnChip) + sizeof(struct mtd_info), GFP_KERNEL); - if (!mtd) { - printk(KERN_WARNING "Cannot allocate memory for data structures. Dropping.\n"); - iounmap(docptr); - return; - } - - this = (struct DiskOnChip *)(&mtd[1]); - mtd->priv = this; - this->virtadr = docptr; - this->physadr = physadr; - this->ChipID = ChipID; - sprintf(namebuf, "with ChipID %2.2X", ChipID); - - switch(ChipID) { - case DOC_ChipID_Doc2kTSOP: - name="2000 TSOP"; - initroutine = symbol_request(DoC2k_init); - break; - - case DOC_ChipID_Doc2k: - name="2000"; - initroutine = symbol_request(DoC2k_init); - break; - - case DOC_ChipID_DocMil: - name="Millennium"; -#ifdef DOC_SINGLE_DRIVER - initroutine = symbol_request(DoC2k_init); -#else - initroutine = symbol_request(DoCMil_init); -#endif /* DOC_SINGLE_DRIVER */ - break; - - case DOC_ChipID_DocMilPlus16: - case DOC_ChipID_DocMilPlus32: - name="MillenniumPlus"; - initroutine = symbol_request(DoCMilPlus_init); - break; - } - - if (initroutine) { - (*initroutine)(mtd); - symbol_put_addr(initroutine); - return; - } - printk(KERN_NOTICE "Cannot find driver for DiskOnChip %s at 0x%lX\n", name, physadr); - kfree(mtd); - } - iounmap(docptr); -} - - -/**************************************************************************** - * - * Module stuff - * - ****************************************************************************/ - -static int __init init_doc(void) -{ - int i; - - if (doc_config_location) { - printk(KERN_INFO "Using configured DiskOnChip probe address 0x%lx\n", doc_config_location); - DoC_Probe(doc_config_location); - } else { - for (i=0; (doc_locations[i] != 0xffffffff); i++) { - DoC_Probe(doc_locations[i]); - } - } - /* No banner message any more. Print a message if no DiskOnChip - found, so the user knows we at least tried. */ - if (!docfound) - printk(KERN_INFO "No recognised DiskOnChip devices found\n"); - return -EAGAIN; -} - -module_init(init_doc); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("David Woodhouse "); -MODULE_DESCRIPTION("Probe code for DiskOnChip 2000 and Millennium devices"); - diff --git a/drivers/mtd/devices/elm.c b/drivers/mtd/devices/elm.c index 2ec5da9ee248..dccef9fdc1f2 100644 --- a/drivers/mtd/devices/elm.c +++ b/drivers/mtd/devices/elm.c @@ -81,14 +81,21 @@ static u32 elm_read_reg(struct elm_info *info, int offset) * @dev: ELM device * @bch_type: Type of BCH ecc */ -void elm_config(struct device *dev, enum bch_ecc bch_type) +int elm_config(struct device *dev, enum bch_ecc bch_type) { u32 reg_val; struct elm_info *info = dev_get_drvdata(dev); + if (!info) { + dev_err(dev, "Unable to configure elm - device not probed?\n"); + return -ENODEV; + } + reg_val = (bch_type & ECC_BCH_LEVEL_MASK) | (ELM_ECC_SIZE << 16); elm_write_reg(info, ELM_LOCATION_CONFIG, reg_val); info->bch_type = bch_type; + + return 0; } EXPORT_SYMBOL(elm_config); diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c index 5b6b0728be21..2f3d2a5ff349 100644 --- a/drivers/mtd/devices/m25p80.c +++ b/drivers/mtd/devices/m25p80.c @@ -681,6 +681,7 @@ struct flash_info { u16 flags; #define SECT_4K 0x01 /* OPCODE_BE_4K works uniformly */ #define M25P_NO_ERASE 0x02 /* No erase command needed */ +#define SST_WRITE 0x04 /* use SST byte programming */ }; #define INFO(_jedec_id, _ext_id, _sector_size, _n_sectors, _flags) \ @@ -728,6 +729,7 @@ static const struct spi_device_id m25p_ids[] = { { "en25q32b", INFO(0x1c3016, 0, 64 * 1024, 64, 0) }, { "en25p64", INFO(0x1c2017, 0, 64 * 1024, 128, 0) }, { "en25q64", INFO(0x1c3017, 0, 64 * 1024, 128, SECT_4K) }, + { "en25qh256", INFO(0x1c7019, 0, 64 * 1024, 512, 0) }, /* Everspin */ { "mr25h256", CAT25_INFO( 32 * 1024, 1, 256, 2) }, @@ -740,7 +742,6 @@ static const struct spi_device_id m25p_ids[] = { { "160s33b", INFO(0x898911, 0, 64 * 1024, 32, 0) }, { "320s33b", INFO(0x898912, 0, 64 * 1024, 64, 0) }, { "640s33b", INFO(0x898913, 0, 64 * 1024, 128, 0) }, - { "n25q064", INFO(0x20ba17, 0, 64 * 1024, 128, 0) }, /* Macronix */ { "mx25l2005a", INFO(0xc22012, 0, 64 * 1024, 4, SECT_4K) }, @@ -753,8 +754,10 @@ static const struct spi_device_id m25p_ids[] = { { "mx25l12855e", INFO(0xc22618, 0, 64 * 1024, 256, 0) }, { "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, 0) }, { "mx25l25655e", INFO(0xc22619, 0, 64 * 1024, 512, 0) }, + { "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, 0) }, /* Micron */ + { "n25q064", INFO(0x20ba17, 0, 64 * 1024, 128, 0) }, { "n25q128a11", INFO(0x20bb18, 0, 64 * 1024, 256, 0) }, { "n25q128a13", INFO(0x20ba18, 0, 64 * 1024, 256, 0) }, { "n25q256a", INFO(0x20ba19, 0, 64 * 1024, 512, SECT_4K) }, @@ -781,14 +784,15 @@ static const struct spi_device_id m25p_ids[] = { { "s25fl064k", INFO(0xef4017, 0, 64 * 1024, 128, SECT_4K) }, /* SST -- large erase sizes are "overlays", "sectors" are 4K */ - { "sst25vf040b", INFO(0xbf258d, 0, 64 * 1024, 8, SECT_4K) }, - { "sst25vf080b", INFO(0xbf258e, 0, 64 * 1024, 16, SECT_4K) }, - { "sst25vf016b", INFO(0xbf2541, 0, 64 * 1024, 32, SECT_4K) }, - { "sst25vf032b", INFO(0xbf254a, 0, 64 * 1024, 64, SECT_4K) }, - { "sst25wf512", INFO(0xbf2501, 0, 64 * 1024, 1, SECT_4K) }, - { "sst25wf010", INFO(0xbf2502, 0, 64 * 1024, 2, SECT_4K) }, - { "sst25wf020", INFO(0xbf2503, 0, 64 * 1024, 4, SECT_4K) }, - { "sst25wf040", INFO(0xbf2504, 0, 64 * 1024, 8, SECT_4K) }, + { "sst25vf040b", INFO(0xbf258d, 0, 64 * 1024, 8, SECT_4K | SST_WRITE) }, + { "sst25vf080b", INFO(0xbf258e, 0, 64 * 1024, 16, SECT_4K | SST_WRITE) }, + { "sst25vf016b", INFO(0xbf2541, 0, 64 * 1024, 32, SECT_4K | SST_WRITE) }, + { "sst25vf032b", INFO(0xbf254a, 0, 64 * 1024, 64, SECT_4K | SST_WRITE) }, + { "sst25vf064c", INFO(0xbf254b, 0, 64 * 1024, 128, SECT_4K) }, + { "sst25wf512", INFO(0xbf2501, 0, 64 * 1024, 1, SECT_4K | SST_WRITE) }, + { "sst25wf010", INFO(0xbf2502, 0, 64 * 1024, 2, SECT_4K | SST_WRITE) }, + { "sst25wf020", INFO(0xbf2503, 0, 64 * 1024, 4, SECT_4K | SST_WRITE) }, + { "sst25wf040", INFO(0xbf2504, 0, 64 * 1024, 8, SECT_4K | SST_WRITE) }, /* ST Microelectronics -- newer production may have feature updates */ { "m25p05", INFO(0x202010, 0, 32 * 1024, 2, 0) }, @@ -838,6 +842,7 @@ static const struct spi_device_id m25p_ids[] = { { "w25q64", INFO(0xef4017, 0, 64 * 1024, 128, SECT_4K) }, { "w25q80", INFO(0xef5014, 0, 64 * 1024, 16, SECT_4K) }, { "w25q80bl", INFO(0xef4014, 0, 64 * 1024, 16, SECT_4K) }, + { "w25q128", INFO(0xef4018, 0, 64 * 1024, 256, SECT_4K) }, { "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K) }, /* Catalyst / On Semiconductor -- non-JEDEC */ @@ -1000,7 +1005,7 @@ static int m25p_probe(struct spi_device *spi) } /* sst flash chips use AAI word program */ - if (JEDEC_MFR(info->jedec_id) == CFI_MFR_SST) + if (info->flags & SST_WRITE) flash->mtd._write = sst_write; else flash->mtd._write = m25p80_write; diff --git a/drivers/mtd/devices/mtd_dataflash.c b/drivers/mtd/devices/mtd_dataflash.c index 945c9f762349..28779b6dfcd9 100644 --- a/drivers/mtd/devices/mtd_dataflash.c +++ b/drivers/mtd/devices/mtd_dataflash.c @@ -105,8 +105,6 @@ static const struct of_device_id dataflash_dt_ids[] = { { .compatible = "atmel,dataflash", }, { /* sentinel */ } }; -#else -#define dataflash_dt_ids NULL #endif /* ......................................................................... */ @@ -914,7 +912,7 @@ static struct spi_driver dataflash_driver = { .driver = { .name = "mtd_dataflash", .owner = THIS_MODULE, - .of_match_table = dataflash_dt_ids, + .of_match_table = of_match_ptr(dataflash_dt_ids), }, .probe = dataflash_probe, diff --git a/drivers/mtd/maps/Kconfig b/drivers/mtd/maps/Kconfig index 3ed17c4d4358..bed9d58d5741 100644 --- a/drivers/mtd/maps/Kconfig +++ b/drivers/mtd/maps/Kconfig @@ -249,22 +249,6 @@ config MTD_LANTIQ help Support for NOR flash attached to the Lantiq SoC's External Bus Unit. -config MTD_DILNETPC - tristate "CFI Flash device mapped on DIL/Net PC" - depends on X86 && MTD_CFI_INTELEXT && BROKEN - help - MTD map driver for SSV DIL/Net PC Boards "DNP" and "ADNP". - For details, see - and - -config MTD_DILNETPC_BOOTSIZE - hex "Size of DIL/Net PC flash boot partition" - depends on MTD_DILNETPC - default "0x80000" - help - The amount of space taken up by the kernel or Etherboot - on the DIL/Net PC flash chips. - config MTD_L440GX tristate "BIOS flash chip on Intel L440GX boards" depends on X86 && MTD_JEDECPROBE @@ -274,42 +258,6 @@ config MTD_L440GX BE VERY CAREFUL. -config MTD_TQM8XXL - tristate "CFI Flash device mapped on TQM8XXL" - depends on MTD_CFI && TQM8xxL - help - The TQM8xxL PowerPC board has up to two banks of CFI-compliant - chips, currently uses AMD one. This 'mapping' driver supports - that arrangement, allowing the CFI probe and command set driver - code to communicate with the chips on the TQM8xxL board. More at - . - -config MTD_RPXLITE - tristate "CFI Flash device mapped on RPX Lite or CLLF" - depends on MTD_CFI && (RPXCLASSIC || RPXLITE) - help - The RPXLite PowerPC board has CFI-compliant chips mapped in - a strange sparse mapping. This 'mapping' driver supports that - arrangement, allowing the CFI probe and command set driver code - to communicate with the chips on the RPXLite board. More at - . - -config MTD_MBX860 - tristate "System flash on MBX860 board" - depends on MTD_CFI && MBX - help - This enables access routines for the flash chips on the Motorola - MBX860 board. If you have one of these boards and would like - to use the flash chips on it, say 'Y'. - -config MTD_DBOX2 - tristate "CFI Flash device mapped on D-Box2" - depends on DBOX2 && MTD_CFI_INTELSTD && MTD_CFI_INTELEXT && MTD_CFI_AMDSTD - help - This enables access routines for the flash chips on the Nokia/Sagem - D-Box 2 board. If you have one of these boards and would like to use - the flash chips on it, say 'Y'. - config MTD_CFI_FLAGADM tristate "CFI Flash device mapping on FlagaDM" depends on 8xx && MTD_CFI @@ -349,15 +297,6 @@ config MTD_IXP4XX IXDP425 and Coyote. If you have an IXP4xx based board and would like to use the flash chips on it, say 'Y'. -config MTD_IXP2000 - tristate "CFI Flash device mapped on Intel IXP2000 based systems" - depends on MTD_CFI && MTD_COMPLEX_MAPPINGS && ARCH_IXP2000 - help - This enables MTD access to flash devices on platforms based - on Intel's IXP2000 family of network processors. If you have an - IXP2000 based board and would like to use the flash chips on it, - say 'Y'. - config MTD_AUTCPU12 bool "NV-RAM mapping AUTCPU12 board" depends on ARCH_AUTCPU12 @@ -372,13 +311,6 @@ config MTD_IMPA7 This enables access to the NOR Flash on the impA7 board of implementa GmbH. If you have such a board, say 'Y' here. -config MTD_H720X - tristate "Hynix evaluation board mappings" - depends on MTD_CFI && ( ARCH_H7201 || ARCH_H7202 ) - help - This enables access to the flash chips on the Hynix evaluation boards. - If you have such a board, say 'Y'. - # This needs CFI or JEDEC, depending on the cards found. config MTD_PCI tristate "PCI MTD driver" @@ -419,7 +351,7 @@ config MTD_BFIN_ASYNC config MTD_GPIO_ADDR tristate "GPIO-assisted Flash Chip Support" - depends on GENERIC_GPIO || GPIOLIB + depends on GPIOLIB depends on MTD_COMPLEX_MAPPINGS help Map driver which allows flashes to be partially physically addressed @@ -433,15 +365,6 @@ config MTD_UCLINUX help Map driver to support image based filesystems for uClinux. -config MTD_DMV182 - tristate "Map driver for Dy-4 SVME/DMV-182 board." - depends on DMV182 - select MTD_MAP_BANK_WIDTH_32 - select MTD_CFI_I8 - select MTD_CFI_AMDSTD - help - Map driver for Dy-4 SVME/DMV-182 board. - config MTD_INTEL_VR_NOR tristate "NOR flash on Intel Vermilion Range Expansion Bus CS0" depends on PCI diff --git a/drivers/mtd/maps/Makefile b/drivers/mtd/maps/Makefile index 4ded28711bc1..395a12444048 100644 --- a/drivers/mtd/maps/Makefile +++ b/drivers/mtd/maps/Makefile @@ -9,7 +9,6 @@ endif # Chip mappings obj-$(CONFIG_MTD_CFI_FLAGADM) += cfi_flagadm.o obj-$(CONFIG_MTD_DC21285) += dc21285.o -obj-$(CONFIG_MTD_DILNETPC) += dilnetpc.o obj-$(CONFIG_MTD_L440GX) += l440gx.o obj-$(CONFIG_MTD_AMD76XROM) += amd76xrom.o obj-$(CONFIG_MTD_ESB2ROM) += esb2rom.o @@ -17,15 +16,12 @@ obj-$(CONFIG_MTD_ICHXROM) += ichxrom.o obj-$(CONFIG_MTD_CK804XROM) += ck804xrom.o obj-$(CONFIG_MTD_TSUNAMI) += tsunami_flash.o obj-$(CONFIG_MTD_PXA2XX) += pxa2xx-flash.o -obj-$(CONFIG_MTD_MBX860) += mbx860.o obj-$(CONFIG_MTD_OCTAGON) += octagon-5066.o obj-$(CONFIG_MTD_PHYSMAP) += physmap.o obj-$(CONFIG_MTD_PHYSMAP_OF) += physmap_of.o obj-$(CONFIG_MTD_PISMO) += pismo.o obj-$(CONFIG_MTD_PMC_MSP_EVM) += pmcmsp-flash.o obj-$(CONFIG_MTD_PCMCIA) += pcmciamtd.o -obj-$(CONFIG_MTD_RPXLITE) += rpxlite.o -obj-$(CONFIG_MTD_TQM8XXL) += tqm8xxl.o obj-$(CONFIG_MTD_SA1100) += sa1100-flash.o obj-$(CONFIG_MTD_SBC_GXX) += sbc_gxx.o obj-$(CONFIG_MTD_SC520CDP) += sc520cdp.o @@ -34,7 +30,6 @@ obj-$(CONFIG_MTD_TS5500) += ts5500_flash.o obj-$(CONFIG_MTD_SUN_UFLASH) += sun_uflash.o obj-$(CONFIG_MTD_VMAX) += vmax301.o obj-$(CONFIG_MTD_SCx200_DOCFLASH)+= scx200_docflash.o -obj-$(CONFIG_MTD_DBOX2) += dbox2-flash.o obj-$(CONFIG_MTD_SOLUTIONENGINE)+= solutionengine.o obj-$(CONFIG_MTD_PCI) += pci.o obj-$(CONFIG_MTD_AUTCPU12) += autcpu12-nvram.o @@ -42,10 +37,7 @@ obj-$(CONFIG_MTD_IMPA7) += impa7.o obj-$(CONFIG_MTD_UCLINUX) += uclinux.o obj-$(CONFIG_MTD_NETtel) += nettel.o obj-$(CONFIG_MTD_SCB2_FLASH) += scb2_flash.o -obj-$(CONFIG_MTD_H720X) += h720x-flash.o obj-$(CONFIG_MTD_IXP4XX) += ixp4xx.o -obj-$(CONFIG_MTD_IXP2000) += ixp2000.o -obj-$(CONFIG_MTD_DMV182) += dmv182.o obj-$(CONFIG_MTD_PLATRAM) += plat-ram.o obj-$(CONFIG_MTD_INTEL_VR_NOR) += intel_vr_nor.o obj-$(CONFIG_MTD_BFIN_ASYNC) += bfin-async-flash.o diff --git a/drivers/mtd/maps/bfin-async-flash.c b/drivers/mtd/maps/bfin-async-flash.c index f833edfaab79..319b04a6c9d1 100644 --- a/drivers/mtd/maps/bfin-async-flash.c +++ b/drivers/mtd/maps/bfin-async-flash.c @@ -122,7 +122,8 @@ static void bfin_flash_copy_to(struct map_info *map, unsigned long to, const voi switch_back(state); } -static const char *part_probe_types[] = { "cmdlinepart", "RedBoot", NULL }; +static const char * const part_probe_types[] = { + "cmdlinepart", "RedBoot", NULL }; static int bfin_flash_probe(struct platform_device *pdev) { diff --git a/drivers/mtd/maps/ck804xrom.c b/drivers/mtd/maps/ck804xrom.c index 586a1c77e48a..0455166f05fa 100644 --- a/drivers/mtd/maps/ck804xrom.c +++ b/drivers/mtd/maps/ck804xrom.c @@ -308,8 +308,7 @@ static int ck804xrom_init_one(struct pci_dev *pdev, out: /* Free any left over map structures */ - if (map) - kfree(map); + kfree(map); /* See if I have any map structures */ if (list_empty(&window->maps)) { diff --git a/drivers/mtd/maps/dbox2-flash.c b/drivers/mtd/maps/dbox2-flash.c deleted file mode 100644 index 85bdece6ab3f..000000000000 --- a/drivers/mtd/maps/dbox2-flash.c +++ /dev/null @@ -1,123 +0,0 @@ -/* - * D-Box 2 flash driver - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* partition_info gives details on the logical partitions that the split the - * single flash device into. If the size if zero we use up to the end of the - * device. */ -static struct mtd_partition partition_info[]= { - { - .name = "BR bootloader", - .size = 128 * 1024, - .offset = 0, - .mask_flags = MTD_WRITEABLE - }, - { - .name = "FLFS (U-Boot)", - .size = 128 * 1024, - .offset = MTDPART_OFS_APPEND, - .mask_flags = 0 - }, - { - .name = "Root (SquashFS)", - .size = 7040 * 1024, - .offset = MTDPART_OFS_APPEND, - .mask_flags = 0 - }, - { - .name = "var (JFFS2)", - .size = 896 * 1024, - .offset = MTDPART_OFS_APPEND, - .mask_flags = 0 - }, - { - .name = "Flash without bootloader", - .size = MTDPART_SIZ_FULL, - .offset = 128 * 1024, - .mask_flags = 0 - }, - { - .name = "Complete Flash", - .size = MTDPART_SIZ_FULL, - .offset = 0, - .mask_flags = MTD_WRITEABLE - } -}; - -#define NUM_PARTITIONS ARRAY_SIZE(partition_info) - -#define WINDOW_ADDR 0x10000000 -#define WINDOW_SIZE 0x800000 - -static struct mtd_info *mymtd; - - -struct map_info dbox2_flash_map = { - .name = "D-Box 2 flash memory", - .size = WINDOW_SIZE, - .bankwidth = 4, - .phys = WINDOW_ADDR, -}; - -static int __init init_dbox2_flash(void) -{ - printk(KERN_NOTICE "D-Box 2 flash driver (size->0x%X mem->0x%X)\n", WINDOW_SIZE, WINDOW_ADDR); - dbox2_flash_map.virt = ioremap(WINDOW_ADDR, WINDOW_SIZE); - - if (!dbox2_flash_map.virt) { - printk("Failed to ioremap\n"); - return -EIO; - } - simple_map_init(&dbox2_flash_map); - - // Probe for dual Intel 28F320 or dual AMD - mymtd = do_map_probe("cfi_probe", &dbox2_flash_map); - if (!mymtd) { - // Probe for single Intel 28F640 - dbox2_flash_map.bankwidth = 2; - - mymtd = do_map_probe("cfi_probe", &dbox2_flash_map); - } - - if (mymtd) { - mymtd->owner = THIS_MODULE; - - /* Create MTD devices for each partition. */ - mtd_device_register(mymtd, partition_info, NUM_PARTITIONS); - - return 0; - } - - iounmap((void *)dbox2_flash_map.virt); - return -ENXIO; -} - -static void __exit cleanup_dbox2_flash(void) -{ - if (mymtd) { - mtd_device_unregister(mymtd); - map_destroy(mymtd); - } - if (dbox2_flash_map.virt) { - iounmap((void *)dbox2_flash_map.virt); - dbox2_flash_map.virt = 0; - } -} - -module_init(init_dbox2_flash); -module_exit(cleanup_dbox2_flash); - - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Kári Davíðsson , Bastian Blank , Alexander Wild "); -MODULE_DESCRIPTION("MTD map driver for D-Box 2 board"); diff --git a/drivers/mtd/maps/dc21285.c b/drivers/mtd/maps/dc21285.c index 080f06053bd4..f8a7dd14cee0 100644 --- a/drivers/mtd/maps/dc21285.c +++ b/drivers/mtd/maps/dc21285.c @@ -143,9 +143,8 @@ static struct map_info dc21285_map = { .copy_from = dc21285_copy_from, }; - /* Partition stuff */ -static const char *probes[] = { "RedBoot", "cmdlinepart", NULL }; +static const char * const probes[] = { "RedBoot", "cmdlinepart", NULL }; static int __init init_dc21285(void) { diff --git a/drivers/mtd/maps/dilnetpc.c b/drivers/mtd/maps/dilnetpc.c deleted file mode 100644 index 3e393f0da823..000000000000 --- a/drivers/mtd/maps/dilnetpc.c +++ /dev/null @@ -1,496 +0,0 @@ -/* dilnetpc.c -- MTD map driver for SSV DIL/Net PC Boards "DNP" and "ADNP" - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA - * - * The DIL/Net PC is a tiny embedded PC board made by SSV Embedded Systems - * featuring the AMD Elan SC410 processor. There are two variants of this - * board: DNP/1486 and ADNP/1486. The DNP version has 2 megs of flash - * ROM (Intel 28F016S3) and 8 megs of DRAM, the ADNP version has 4 megs - * flash and 16 megs of RAM. - * For details, see http://www.ssv-embedded.de/ssv/pc104/p169.htm - * and http://www.ssv-embedded.de/ssv/pc104/p170.htm - */ - -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -#include - -/* -** The DIL/NetPC keeps its BIOS in two distinct flash blocks. -** Destroying any of these blocks transforms the DNPC into -** a paperweight (albeit not a very useful one, considering -** it only weighs a few grams). -** -** Therefore, the BIOS blocks must never be erased or written to -** except by people who know exactly what they are doing (e.g. -** to install a BIOS update). These partitions are marked read-only -** by default, but can be made read/write by undefining -** DNPC_BIOS_BLOCKS_WRITEPROTECTED: -*/ -#define DNPC_BIOS_BLOCKS_WRITEPROTECTED - -/* -** The ID string (in ROM) is checked to determine whether we -** are running on a DNP/1486 or ADNP/1486 -*/ -#define BIOSID_BASE 0x000fe100 - -#define ID_DNPC "DNP1486" -#define ID_ADNP "ADNP1486" - -/* -** Address where the flash should appear in CPU space -*/ -#define FLASH_BASE 0x2000000 - -/* -** Chip Setup and Control (CSC) indexed register space -*/ -#define CSC_INDEX 0x22 -#define CSC_DATA 0x23 - -#define CSC_MMSWAR 0x30 /* MMS window C-F attributes register */ -#define CSC_MMSWDSR 0x31 /* MMS window C-F device select register */ - -#define CSC_RBWR 0xa7 /* GPIO Read-Back/Write Register B */ - -#define CSC_CR 0xd0 /* internal I/O device disable/Echo */ - /* Z-bus/configuration register */ - -#define CSC_PCCMDCR 0xf1 /* PC card mode and DMA control register */ - - -/* -** PC Card indexed register space: -*/ - -#define PCC_INDEX 0x3e0 -#define PCC_DATA 0x3e1 - -#define PCC_AWER_B 0x46 /* Socket B Address Window enable register */ -#define PCC_MWSAR_1_Lo 0x58 /* memory window 1 start address low register */ -#define PCC_MWSAR_1_Hi 0x59 /* memory window 1 start address high register */ -#define PCC_MWEAR_1_Lo 0x5A /* memory window 1 stop address low register */ -#define PCC_MWEAR_1_Hi 0x5B /* memory window 1 stop address high register */ -#define PCC_MWAOR_1_Lo 0x5C /* memory window 1 address offset low register */ -#define PCC_MWAOR_1_Hi 0x5D /* memory window 1 address offset high register */ - - -/* -** Access to SC4x0's Chip Setup and Control (CSC) -** and PC Card (PCC) indexed registers: -*/ -static inline void setcsc(int reg, unsigned char data) -{ - outb(reg, CSC_INDEX); - outb(data, CSC_DATA); -} - -static inline unsigned char getcsc(int reg) -{ - outb(reg, CSC_INDEX); - return(inb(CSC_DATA)); -} - -static inline void setpcc(int reg, unsigned char data) -{ - outb(reg, PCC_INDEX); - outb(data, PCC_DATA); -} - -static inline unsigned char getpcc(int reg) -{ - outb(reg, PCC_INDEX); - return(inb(PCC_DATA)); -} - - -/* -************************************************************ -** Enable access to DIL/NetPC's flash by mapping it into -** the SC4x0's MMS Window C. -************************************************************ -*/ -static void dnpc_map_flash(unsigned long flash_base, unsigned long flash_size) -{ - unsigned long flash_end = flash_base + flash_size - 1; - - /* - ** enable setup of MMS windows C-F: - */ - /* - enable PC Card indexed register space */ - setcsc(CSC_CR, getcsc(CSC_CR) | 0x2); - /* - set PC Card controller to operate in standard mode */ - setcsc(CSC_PCCMDCR, getcsc(CSC_PCCMDCR) & ~1); - - /* - ** Program base address and end address of window - ** where the flash ROM should appear in CPU address space - */ - setpcc(PCC_MWSAR_1_Lo, (flash_base >> 12) & 0xff); - setpcc(PCC_MWSAR_1_Hi, (flash_base >> 20) & 0x3f); - setpcc(PCC_MWEAR_1_Lo, (flash_end >> 12) & 0xff); - setpcc(PCC_MWEAR_1_Hi, (flash_end >> 20) & 0x3f); - - /* program offset of first flash location to appear in this window (0) */ - setpcc(PCC_MWAOR_1_Lo, ((0 - flash_base) >> 12) & 0xff); - setpcc(PCC_MWAOR_1_Hi, ((0 - flash_base)>> 20) & 0x3f); - - /* set attributes for MMS window C: non-cacheable, write-enabled */ - setcsc(CSC_MMSWAR, getcsc(CSC_MMSWAR) & ~0x11); - - /* select physical device ROMCS0 (i.e. flash) for MMS Window C */ - setcsc(CSC_MMSWDSR, getcsc(CSC_MMSWDSR) & ~0x03); - - /* enable memory window 1 */ - setpcc(PCC_AWER_B, getpcc(PCC_AWER_B) | 0x02); - - /* now disable PC Card indexed register space again */ - setcsc(CSC_CR, getcsc(CSC_CR) & ~0x2); -} - - -/* -************************************************************ -** Disable access to DIL/NetPC's flash by mapping it into -** the SC4x0's MMS Window C. -************************************************************ -*/ -static void dnpc_unmap_flash(void) -{ - /* - enable PC Card indexed register space */ - setcsc(CSC_CR, getcsc(CSC_CR) | 0x2); - - /* disable memory window 1 */ - setpcc(PCC_AWER_B, getpcc(PCC_AWER_B) & ~0x02); - - /* now disable PC Card indexed register space again */ - setcsc(CSC_CR, getcsc(CSC_CR) & ~0x2); -} - - - -/* -************************************************************ -** Enable/Disable VPP to write to flash -************************************************************ -*/ - -static DEFINE_SPINLOCK(dnpc_spin); -static int vpp_counter = 0; -/* -** This is what has to be done for the DNP board .. -*/ -static void dnp_set_vpp(struct map_info *not_used, int on) -{ - spin_lock_irq(&dnpc_spin); - - if (on) - { - if(++vpp_counter == 1) - setcsc(CSC_RBWR, getcsc(CSC_RBWR) & ~0x4); - } - else - { - if(--vpp_counter == 0) - setcsc(CSC_RBWR, getcsc(CSC_RBWR) | 0x4); - else - BUG_ON(vpp_counter < 0); - } - spin_unlock_irq(&dnpc_spin); -} - -/* -** .. and this the ADNP version: -*/ -static void adnp_set_vpp(struct map_info *not_used, int on) -{ - spin_lock_irq(&dnpc_spin); - - if (on) - { - if(++vpp_counter == 1) - setcsc(CSC_RBWR, getcsc(CSC_RBWR) & ~0x8); - } - else - { - if(--vpp_counter == 0) - setcsc(CSC_RBWR, getcsc(CSC_RBWR) | 0x8); - else - BUG_ON(vpp_counter < 0); - } - spin_unlock_irq(&dnpc_spin); -} - - - -#define DNP_WINDOW_SIZE 0x00200000 /* DNP flash size is 2MiB */ -#define ADNP_WINDOW_SIZE 0x00400000 /* ADNP flash size is 4MiB */ -#define WINDOW_ADDR FLASH_BASE - -static struct map_info dnpc_map = { - .name = "ADNP Flash Bank", - .size = ADNP_WINDOW_SIZE, - .bankwidth = 1, - .set_vpp = adnp_set_vpp, - .phys = WINDOW_ADDR -}; - -/* -** The layout of the flash is somewhat "strange": -** -** 1. 960 KiB (15 blocks) : Space for ROM Bootloader and user data -** 2. 64 KiB (1 block) : System BIOS -** 3. 960 KiB (15 blocks) : User Data (DNP model) or -** 3. 3008 KiB (47 blocks) : User Data (ADNP model) -** 4. 64 KiB (1 block) : System BIOS Entry -*/ - -static struct mtd_partition partition_info[]= -{ - { - .name = "ADNP boot", - .offset = 0, - .size = 0xf0000, - }, - { - .name = "ADNP system BIOS", - .offset = MTDPART_OFS_NXTBLK, - .size = 0x10000, -#ifdef DNPC_BIOS_BLOCKS_WRITEPROTECTED - .mask_flags = MTD_WRITEABLE, -#endif - }, - { - .name = "ADNP file system", - .offset = MTDPART_OFS_NXTBLK, - .size = 0x2f0000, - }, - { - .name = "ADNP system BIOS entry", - .offset = MTDPART_OFS_NXTBLK, - .size = MTDPART_SIZ_FULL, -#ifdef DNPC_BIOS_BLOCKS_WRITEPROTECTED - .mask_flags = MTD_WRITEABLE, -#endif - }, -}; - -#define NUM_PARTITIONS ARRAY_SIZE(partition_info) - -static struct mtd_info *mymtd; -static struct mtd_info *lowlvl_parts[NUM_PARTITIONS]; -static struct mtd_info *merged_mtd; - -/* -** "Highlevel" partition info: -** -** Using the MTD concat layer, we can re-arrange partitions to our -** liking: we construct a virtual MTD device by concatenating the -** partitions, specifying the sequence such that the boot block -** is immediately followed by the filesystem block (i.e. the stupid -** system BIOS block is mapped to a different place). When re-partitioning -** this concatenated MTD device, we can set the boot block size to -** an arbitrary (though erase block aligned) value i.e. not one that -** is dictated by the flash's physical layout. We can thus set the -** boot block to be e.g. 64 KB (which is fully sufficient if we want -** to boot an etherboot image) or to -say- 1.5 MB if we want to boot -** a large kernel image. In all cases, the remainder of the flash -** is available as file system space. -*/ - -static struct mtd_partition higlvl_partition_info[]= -{ - { - .name = "ADNP boot block", - .offset = 0, - .size = CONFIG_MTD_DILNETPC_BOOTSIZE, - }, - { - .name = "ADNP file system space", - .offset = MTDPART_OFS_NXTBLK, - .size = ADNP_WINDOW_SIZE-CONFIG_MTD_DILNETPC_BOOTSIZE-0x20000, - }, - { - .name = "ADNP system BIOS + BIOS Entry", - .offset = MTDPART_OFS_NXTBLK, - .size = MTDPART_SIZ_FULL, -#ifdef DNPC_BIOS_BLOCKS_WRITEPROTECTED - .mask_flags = MTD_WRITEABLE, -#endif - }, -}; - -#define NUM_HIGHLVL_PARTITIONS ARRAY_SIZE(higlvl_partition_info) - - -static int dnp_adnp_probe(void) -{ - char *biosid, rc = -1; - - biosid = (char*)ioremap(BIOSID_BASE, 16); - if(biosid) - { - if(!strcmp(biosid, ID_DNPC)) - rc = 1; /* this is a DNPC */ - else if(!strcmp(biosid, ID_ADNP)) - rc = 0; /* this is a ADNPC */ - } - iounmap((void *)biosid); - return(rc); -} - - -static int __init init_dnpc(void) -{ - int is_dnp; - - /* - ** determine hardware (DNP/ADNP/invalid) - */ - if((is_dnp = dnp_adnp_probe()) < 0) - return -ENXIO; - - /* - ** Things are set up for ADNP by default - ** -> modify all that needs to be different for DNP - */ - if(is_dnp) - { /* - ** Adjust window size, select correct set_vpp function. - ** The partitioning scheme is identical on both DNP - ** and ADNP except for the size of the third partition. - */ - int i; - dnpc_map.size = DNP_WINDOW_SIZE; - dnpc_map.set_vpp = dnp_set_vpp; - partition_info[2].size = 0xf0000; - - /* - ** increment all string pointers so the leading 'A' gets skipped, - ** thus turning all occurrences of "ADNP ..." into "DNP ..." - */ - ++dnpc_map.name; - for(i = 0; i < NUM_PARTITIONS; i++) - ++partition_info[i].name; - higlvl_partition_info[1].size = DNP_WINDOW_SIZE - - CONFIG_MTD_DILNETPC_BOOTSIZE - 0x20000; - for(i = 0; i < NUM_HIGHLVL_PARTITIONS; i++) - ++higlvl_partition_info[i].name; - } - - printk(KERN_NOTICE "DIL/Net %s flash: 0x%lx at 0x%llx\n", - is_dnp ? "DNPC" : "ADNP", dnpc_map.size, (unsigned long long)dnpc_map.phys); - - dnpc_map.virt = ioremap_nocache(dnpc_map.phys, dnpc_map.size); - - dnpc_map_flash(dnpc_map.phys, dnpc_map.size); - - if (!dnpc_map.virt) { - printk("Failed to ioremap_nocache\n"); - return -EIO; - } - simple_map_init(&dnpc_map); - - printk("FLASH virtual address: 0x%p\n", dnpc_map.virt); - - mymtd = do_map_probe("jedec_probe", &dnpc_map); - - if (!mymtd) - mymtd = do_map_probe("cfi_probe", &dnpc_map); - - /* - ** If flash probes fail, try to make flashes accessible - ** at least as ROM. Ajust erasesize in this case since - ** the default one (128M) will break our partitioning - */ - if (!mymtd) - if((mymtd = do_map_probe("map_rom", &dnpc_map))) - mymtd->erasesize = 0x10000; - - if (!mymtd) { - iounmap(dnpc_map.virt); - return -ENXIO; - } - - mymtd->owner = THIS_MODULE; - - /* - ** Supply pointers to lowlvl_parts[] array to add_mtd_partitions() - ** -> add_mtd_partitions() will _not_ register MTD devices for - ** the partitions, but will instead store pointers to the MTD - ** objects it creates into our lowlvl_parts[] array. - ** NOTE: we arrange the pointers such that the sequence of the - ** partitions gets re-arranged: partition #2 follows - ** partition #0. - */ - partition_info[0].mtdp = &lowlvl_parts[0]; - partition_info[1].mtdp = &lowlvl_parts[2]; - partition_info[2].mtdp = &lowlvl_parts[1]; - partition_info[3].mtdp = &lowlvl_parts[3]; - - mtd_device_register(mymtd, partition_info, NUM_PARTITIONS); - - /* - ** now create a virtual MTD device by concatenating the for partitions - ** (in the sequence given by the lowlvl_parts[] array. - */ - merged_mtd = mtd_concat_create(lowlvl_parts, NUM_PARTITIONS, "(A)DNP Flash Concatenated"); - if(merged_mtd) - { /* - ** now partition the new device the way we want it. This time, - ** we do not supply mtd pointers in higlvl_partition_info, so - ** add_mtd_partitions() will register the devices. - */ - mtd_device_register(merged_mtd, higlvl_partition_info, - NUM_HIGHLVL_PARTITIONS); - } - - return 0; -} - -static void __exit cleanup_dnpc(void) -{ - if(merged_mtd) { - mtd_device_unregister(merged_mtd); - mtd_concat_destroy(merged_mtd); - } - - if (mymtd) { - mtd_device_unregister(mymtd); - map_destroy(mymtd); - } - if (dnpc_map.virt) { - iounmap(dnpc_map.virt); - dnpc_unmap_flash(); - dnpc_map.virt = NULL; - } -} - -module_init(init_dnpc); -module_exit(cleanup_dnpc); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Sysgo Real-Time Solutions GmbH"); -MODULE_DESCRIPTION("MTD map driver for SSV DIL/NetPC DNP & ADNP"); diff --git a/drivers/mtd/maps/dmv182.c b/drivers/mtd/maps/dmv182.c deleted file mode 100644 index 6538ac675e00..000000000000 --- a/drivers/mtd/maps/dmv182.c +++ /dev/null @@ -1,146 +0,0 @@ - -/* - * drivers/mtd/maps/dmv182.c - * - * Flash map driver for the Dy4 SVME182 board - * - * Copyright 2003-2004, TimeSys Corporation - * - * Based on the SVME181 flash map, by Tom Nelson, Dot4, Inc. for TimeSys Corp. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or (at your - * option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* - * This driver currently handles only the 16MiB user flash bank 1 on the - * board. It does not provide access to bank 0 (contains the Dy4 FFW), bank 2 - * (VxWorks boot), or the optional 48MiB expansion flash. - * - * scott.wood@timesys.com: On the newer boards with 128MiB flash, it - * now supports the first 96MiB (the boot flash bank containing FFW - * is excluded). The VxWorks loader is in partition 1. - */ - -#define FLASH_BASE_ADDR 0xf0000000 -#define FLASH_BANK_SIZE (128*1024*1024) - -MODULE_AUTHOR("Scott Wood, TimeSys Corporation "); -MODULE_DESCRIPTION("User-programmable flash device on the Dy4 SVME182 board"); -MODULE_LICENSE("GPL"); - -static struct map_info svme182_map = { - .name = "Dy4 SVME182", - .bankwidth = 32, - .size = 128 * 1024 * 1024 -}; - -#define BOOTIMAGE_PART_SIZE ((6*1024*1024)-RESERVED_PART_SIZE) - -// Allow 6MiB for the kernel -#define NEW_BOOTIMAGE_PART_SIZE (6 * 1024 * 1024) -// Allow 1MiB for the bootloader -#define NEW_BOOTLOADER_PART_SIZE (1024 * 1024) -// Use the remaining 9MiB at the end of flash for the RFS -#define NEW_RFS_PART_SIZE (0x01000000 - NEW_BOOTLOADER_PART_SIZE - \ - NEW_BOOTIMAGE_PART_SIZE) - -static struct mtd_partition svme182_partitions[] = { - // The Lower PABS is only 128KiB, but the partition code doesn't - // like partitions that don't end on the largest erase block - // size of the device, even if all of the erase blocks in the - // partition are small ones. The hardware should prevent - // writes to the actual PABS areas. - { - name: "Lower PABS and CPU 0 bootloader or kernel", - size: 6*1024*1024, - offset: 0, - }, - { - name: "Root Filesystem", - size: 10*1024*1024, - offset: MTDPART_OFS_NXTBLK - }, - { - name: "CPU1 Bootloader", - size: 1024*1024, - offset: MTDPART_OFS_NXTBLK, - }, - { - name: "Extra", - size: 110*1024*1024, - offset: MTDPART_OFS_NXTBLK - }, - { - name: "Foundation Firmware and Upper PABS", - size: 1024*1024, - offset: MTDPART_OFS_NXTBLK, - mask_flags: MTD_WRITEABLE // read-only - } -}; - -static struct mtd_info *this_mtd; - -static int __init init_svme182(void) -{ - struct mtd_partition *partitions; - int num_parts = ARRAY_SIZE(svme182_partitions); - - partitions = svme182_partitions; - - svme182_map.virt = ioremap(FLASH_BASE_ADDR, svme182_map.size); - - if (svme182_map.virt == 0) { - printk("Failed to ioremap FLASH memory area.\n"); - return -EIO; - } - - simple_map_init(&svme182_map); - - this_mtd = do_map_probe("cfi_probe", &svme182_map); - if (!this_mtd) - { - iounmap((void *)svme182_map.virt); - return -ENXIO; - } - - printk(KERN_NOTICE "SVME182 flash device: %dMiB at 0x%08x\n", - this_mtd->size >> 20, FLASH_BASE_ADDR); - - this_mtd->owner = THIS_MODULE; - mtd_device_register(this_mtd, partitions, num_parts); - - return 0; -} - -static void __exit cleanup_svme182(void) -{ - if (this_mtd) - { - mtd_device_unregister(this_mtd); - map_destroy(this_mtd); - } - - if (svme182_map.virt) - { - iounmap((void *)svme182_map.virt); - svme182_map.virt = 0; - } - - return; -} - -module_init(init_svme182); -module_exit(cleanup_svme182); diff --git a/drivers/mtd/maps/gpio-addr-flash.c b/drivers/mtd/maps/gpio-addr-flash.c index 7b643de2500b..5ede28294f9e 100644 --- a/drivers/mtd/maps/gpio-addr-flash.c +++ b/drivers/mtd/maps/gpio-addr-flash.c @@ -157,7 +157,8 @@ static void gf_copy_to(struct map_info *map, unsigned long to, memcpy_toio(map->virt + (to % state->win_size), from, len); } -static const char *part_probe_types[] = { "cmdlinepart", "RedBoot", NULL }; +static const char * const part_probe_types[] = { + "cmdlinepart", "RedBoot", NULL }; /** * gpio_flash_probe() - setup a mapping for a GPIO assisted flash diff --git a/drivers/mtd/maps/h720x-flash.c b/drivers/mtd/maps/h720x-flash.c deleted file mode 100644 index 8ed6cb4529d8..000000000000 --- a/drivers/mtd/maps/h720x-flash.c +++ /dev/null @@ -1,120 +0,0 @@ -/* - * Flash memory access on Hynix GMS30C7201/HMS30C7202 based - * evaluation boards - * - * (C) 2002 Jungjun Kim - * 2003 Thomas Gleixner - */ - -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include - -static struct mtd_info *mymtd; - -static struct map_info h720x_map = { - .name = "H720X", - .bankwidth = 4, - .size = H720X_FLASH_SIZE, - .phys = H720X_FLASH_PHYS, -}; - -static struct mtd_partition h720x_partitions[] = { - { - .name = "ArMon", - .size = 0x00080000, - .offset = 0, - .mask_flags = MTD_WRITEABLE - },{ - .name = "Env", - .size = 0x00040000, - .offset = 0x00080000, - .mask_flags = MTD_WRITEABLE - },{ - .name = "Kernel", - .size = 0x00180000, - .offset = 0x000c0000, - .mask_flags = MTD_WRITEABLE - },{ - .name = "Ramdisk", - .size = 0x00400000, - .offset = 0x00240000, - .mask_flags = MTD_WRITEABLE - },{ - .name = "jffs2", - .size = MTDPART_SIZ_FULL, - .offset = MTDPART_OFS_APPEND - } -}; - -#define NUM_PARTITIONS ARRAY_SIZE(h720x_partitions) - -/* - * Initialize FLASH support - */ -static int __init h720x_mtd_init(void) -{ - h720x_map.virt = ioremap(h720x_map.phys, h720x_map.size); - - if (!h720x_map.virt) { - printk(KERN_ERR "H720x-MTD: ioremap failed\n"); - return -EIO; - } - - simple_map_init(&h720x_map); - - // Probe for flash bankwidth 4 - printk (KERN_INFO "H720x-MTD probing 32bit FLASH\n"); - mymtd = do_map_probe("cfi_probe", &h720x_map); - if (!mymtd) { - printk (KERN_INFO "H720x-MTD probing 16bit FLASH\n"); - // Probe for bankwidth 2 - h720x_map.bankwidth = 2; - mymtd = do_map_probe("cfi_probe", &h720x_map); - } - - if (mymtd) { - mymtd->owner = THIS_MODULE; - - mtd_device_parse_register(mymtd, NULL, NULL, - h720x_partitions, NUM_PARTITIONS); - return 0; - } - - iounmap((void *)h720x_map.virt); - return -ENXIO; -} - -/* - * Cleanup - */ -static void __exit h720x_mtd_cleanup(void) -{ - - if (mymtd) { - mtd_device_unregister(mymtd); - map_destroy(mymtd); - } - - if (h720x_map.virt) { - iounmap((void *)h720x_map.virt); - h720x_map.virt = 0; - } -} - - -module_init(h720x_mtd_init); -module_exit(h720x_mtd_cleanup); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Thomas Gleixner "); -MODULE_DESCRIPTION("MTD map driver for Hynix evaluation boards"); diff --git a/drivers/mtd/maps/impa7.c b/drivers/mtd/maps/impa7.c index 834a06c56f56..49686744d93c 100644 --- a/drivers/mtd/maps/impa7.c +++ b/drivers/mtd/maps/impa7.c @@ -24,14 +24,12 @@ #define NUM_FLASHBANKS 2 #define BUSWIDTH 4 -/* can be { "cfi_probe", "jedec_probe", "map_rom", NULL } */ -#define PROBETYPES { "jedec_probe", NULL } - #define MSG_PREFIX "impA7:" /* prefix for our printk()'s */ #define MTDID "impa7-%d" /* for mtdparts= partitioning */ static struct mtd_info *impa7_mtd[NUM_FLASHBANKS]; +static const char * const rom_probe_types[] = { "jedec_probe", NULL }; static struct map_info impa7_map[NUM_FLASHBANKS] = { { @@ -60,8 +58,7 @@ static struct mtd_partition partitions[] = static int __init init_impa7(void) { - static const char *rom_probe_types[] = PROBETYPES; - const char **type; + const char * const *type; int i; static struct { u_long addr; u_long size; } pt[NUM_FLASHBANKS] = { { WINDOW_ADDR0, WINDOW_SIZE0 }, diff --git a/drivers/mtd/maps/intel_vr_nor.c b/drivers/mtd/maps/intel_vr_nor.c index b14053b25026..f581ac1cf022 100644 --- a/drivers/mtd/maps/intel_vr_nor.c +++ b/drivers/mtd/maps/intel_vr_nor.c @@ -82,9 +82,9 @@ static void vr_nor_destroy_mtd_setup(struct vr_nor_mtd *p) static int vr_nor_mtd_setup(struct vr_nor_mtd *p) { - static const char *probe_types[] = + static const char * const probe_types[] = { "cfi_probe", "jedec_probe", NULL }; - const char **type; + const char * const *type; for (type = probe_types; !p->info && *type; type++) p->info = do_map_probe(*type, &p->map); diff --git a/drivers/mtd/maps/ixp2000.c b/drivers/mtd/maps/ixp2000.c deleted file mode 100644 index 4a41ced0f710..000000000000 --- a/drivers/mtd/maps/ixp2000.c +++ /dev/null @@ -1,253 +0,0 @@ -/* - * drivers/mtd/maps/ixp2000.c - * - * Mapping for the Intel XScale IXP2000 based systems - * - * Copyright (C) 2002 Intel Corp. - * Copyright (C) 2003-2004 MontaVista Software, Inc. - * - * Original Author: Naeem M Afzal - * Maintainer: Deepak Saxena - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include - -#include -#include -#include - -#include - -struct ixp2000_flash_info { - struct mtd_info *mtd; - struct map_info map; - struct resource *res; -}; - -static inline unsigned long flash_bank_setup(struct map_info *map, unsigned long ofs) -{ - unsigned long (*set_bank)(unsigned long) = - (unsigned long(*)(unsigned long))map->map_priv_2; - - return (set_bank ? set_bank(ofs) : ofs); -} - -#ifdef __ARMEB__ -/* - * Rev A0 and A1 of IXP2400 silicon have a broken addressing unit which - * causes the lower address bits to be XORed with 0x11 on 8 bit accesses - * and XORed with 0x10 on 16 bit accesses. See the spec update, erratum 44. - */ -static int erratum44_workaround = 0; - -static inline unsigned long address_fix8_write(unsigned long addr) -{ - if (erratum44_workaround) { - return (addr ^ 3); - } - return addr; -} -#else - -#define address_fix8_write(x) (x) -#endif - -static map_word ixp2000_flash_read8(struct map_info *map, unsigned long ofs) -{ - map_word val; - - val.x[0] = *((u8 *)(map->map_priv_1 + flash_bank_setup(map, ofs))); - return val; -} - -/* - * We can't use the standard memcpy due to the broken SlowPort - * address translation on rev A0 and A1 silicon and the fact that - * we have banked flash. - */ -static void ixp2000_flash_copy_from(struct map_info *map, void *to, - unsigned long from, ssize_t len) -{ - from = flash_bank_setup(map, from); - while(len--) - *(__u8 *) to++ = *(__u8 *)(map->map_priv_1 + from++); -} - -static void ixp2000_flash_write8(struct map_info *map, map_word d, unsigned long ofs) -{ - *(__u8 *) (address_fix8_write(map->map_priv_1 + - flash_bank_setup(map, ofs))) = d.x[0]; -} - -static void ixp2000_flash_copy_to(struct map_info *map, unsigned long to, - const void *from, ssize_t len) -{ - to = flash_bank_setup(map, to); - while(len--) { - unsigned long tmp = address_fix8_write(map->map_priv_1 + to++); - *(__u8 *)(tmp) = *(__u8 *)(from++); - } -} - - -static int ixp2000_flash_remove(struct platform_device *dev) -{ - struct flash_platform_data *plat = dev->dev.platform_data; - struct ixp2000_flash_info *info = platform_get_drvdata(dev); - - platform_set_drvdata(dev, NULL); - - if(!info) - return 0; - - if (info->mtd) { - mtd_device_unregister(info->mtd); - map_destroy(info->mtd); - } - if (info->map.map_priv_1) - iounmap((void *) info->map.map_priv_1); - - if (info->res) { - release_resource(info->res); - kfree(info->res); - } - - if (plat->exit) - plat->exit(); - - return 0; -} - - -static int ixp2000_flash_probe(struct platform_device *dev) -{ - static const char *probes[] = { "RedBoot", "cmdlinepart", NULL }; - struct ixp2000_flash_data *ixp_data = dev->dev.platform_data; - struct flash_platform_data *plat; - struct ixp2000_flash_info *info; - unsigned long window_size; - int err = -1; - - if (!ixp_data) - return -ENODEV; - - plat = ixp_data->platform_data; - if (!plat) - return -ENODEV; - - window_size = resource_size(dev->resource); - dev_info(&dev->dev, "Probe of IXP2000 flash(%d banks x %dMiB)\n", - ixp_data->nr_banks, ((u32)window_size >> 20)); - - if (plat->width != 1) { - dev_err(&dev->dev, "IXP2000 MTD map only supports 8-bit mode, asking for %d\n", - plat->width * 8); - return -EIO; - } - - info = kzalloc(sizeof(struct ixp2000_flash_info), GFP_KERNEL); - if(!info) { - err = -ENOMEM; - goto Error; - } - - platform_set_drvdata(dev, info); - - /* - * Tell the MTD layer we're not 1:1 mapped so that it does - * not attempt to do a direct access on us. - */ - info->map.phys = NO_XIP; - - info->map.size = ixp_data->nr_banks * window_size; - info->map.bankwidth = 1; - - /* - * map_priv_2 is used to store a ptr to the bank_setup routine - */ - info->map.map_priv_2 = (unsigned long) ixp_data->bank_setup; - - info->map.name = dev_name(&dev->dev); - info->map.read = ixp2000_flash_read8; - info->map.write = ixp2000_flash_write8; - info->map.copy_from = ixp2000_flash_copy_from; - info->map.copy_to = ixp2000_flash_copy_to; - - info->res = request_mem_region(dev->resource->start, - resource_size(dev->resource), - dev_name(&dev->dev)); - if (!info->res) { - dev_err(&dev->dev, "Could not reserve memory region\n"); - err = -ENOMEM; - goto Error; - } - - info->map.map_priv_1 = - (unsigned long)ioremap(dev->resource->start, - resource_size(dev->resource)); - if (!info->map.map_priv_1) { - dev_err(&dev->dev, "Failed to ioremap flash region\n"); - err = -EIO; - goto Error; - } - -#if defined(__ARMEB__) - /* - * Enable erratum 44 workaround for NPUs with broken slowport - */ - - erratum44_workaround = ixp2000_has_broken_slowport(); - dev_info(&dev->dev, "Erratum 44 workaround %s\n", - erratum44_workaround ? "enabled" : "disabled"); -#endif - - info->mtd = do_map_probe(plat->map_name, &info->map); - if (!info->mtd) { - dev_err(&dev->dev, "map_probe failed\n"); - err = -ENXIO; - goto Error; - } - info->mtd->owner = THIS_MODULE; - - err = mtd_device_parse_register(info->mtd, probes, NULL, NULL, 0); - if (err) - goto Error; - - return 0; - -Error: - ixp2000_flash_remove(dev); - return err; -} - -static struct platform_driver ixp2000_flash_driver = { - .probe = ixp2000_flash_probe, - .remove = ixp2000_flash_remove, - .driver = { - .name = "IXP2000-Flash", - .owner = THIS_MODULE, - }, -}; - -module_platform_driver(ixp2000_flash_driver); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Deepak Saxena "); -MODULE_ALIAS("platform:IXP2000-Flash"); diff --git a/drivers/mtd/maps/ixp4xx.c b/drivers/mtd/maps/ixp4xx.c index e864fc6c58f9..52b3410a105c 100644 --- a/drivers/mtd/maps/ixp4xx.c +++ b/drivers/mtd/maps/ixp4xx.c @@ -148,7 +148,7 @@ struct ixp4xx_flash_info { struct resource *res; }; -static const char *probes[] = { "RedBoot", "cmdlinepart", NULL }; +static const char * const probes[] = { "RedBoot", "cmdlinepart", NULL }; static int ixp4xx_flash_remove(struct platform_device *dev) { diff --git a/drivers/mtd/maps/lantiq-flash.c b/drivers/mtd/maps/lantiq-flash.c index d1da6ede3845..d7ac65d1d569 100644 --- a/drivers/mtd/maps/lantiq-flash.c +++ b/drivers/mtd/maps/lantiq-flash.c @@ -46,8 +46,7 @@ struct ltq_mtd { }; static const char ltq_map_name[] = "ltq_nor"; -static const char *ltq_probe_types[] = { - "cmdlinepart", "ofpart", NULL }; +static const char * const ltq_probe_types[] = { "cmdlinepart", "ofpart", NULL }; static map_word ltq_read16(struct map_info *map, unsigned long adr) diff --git a/drivers/mtd/maps/mbx860.c b/drivers/mtd/maps/mbx860.c deleted file mode 100644 index 93fa56c33003..000000000000 --- a/drivers/mtd/maps/mbx860.c +++ /dev/null @@ -1,98 +0,0 @@ -/* - * Handle mapping of the flash on MBX860 boards - * - * Author: Anton Todorov - * Copyright: (C) 2001 Emness Technology - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - */ - -#include -#include -#include -#include -#include -#include -#include -#include - - -#define WINDOW_ADDR 0xfe000000 -#define WINDOW_SIZE 0x00200000 - -/* Flash / Partition sizing */ -#define MAX_SIZE_KiB 8192 -#define BOOT_PARTITION_SIZE_KiB 512 -#define KERNEL_PARTITION_SIZE_KiB 5632 -#define APP_PARTITION_SIZE_KiB 2048 - -#define NUM_PARTITIONS 3 - -/* partition_info gives details on the logical partitions that the split the - * single flash device into. If the size if zero we use up to the end of the - * device. */ -static struct mtd_partition partition_info[]={ - { .name = "MBX flash BOOT partition", - .offset = 0, - .size = BOOT_PARTITION_SIZE_KiB*1024 }, - { .name = "MBX flash DATA partition", - .offset = BOOT_PARTITION_SIZE_KiB*1024, - .size = (KERNEL_PARTITION_SIZE_KiB)*1024 }, - { .name = "MBX flash APPLICATION partition", - .offset = (BOOT_PARTITION_SIZE_KiB+KERNEL_PARTITION_SIZE_KiB)*1024 } -}; - - -static struct mtd_info *mymtd; - -struct map_info mbx_map = { - .name = "MBX flash", - .size = WINDOW_SIZE, - .phys = WINDOW_ADDR, - .bankwidth = 4, -}; - -static int __init init_mbx(void) -{ - printk(KERN_NOTICE "Motorola MBX flash device: 0x%x at 0x%x\n", WINDOW_SIZE*4, WINDOW_ADDR); - mbx_map.virt = ioremap(WINDOW_ADDR, WINDOW_SIZE * 4); - - if (!mbx_map.virt) { - printk("Failed to ioremap\n"); - return -EIO; - } - simple_map_init(&mbx_map); - - mymtd = do_map_probe("jedec_probe", &mbx_map); - if (mymtd) { - mymtd->owner = THIS_MODULE; - mtd_device_register(mymtd, NULL, 0); - mtd_device_register(mymtd, partition_info, NUM_PARTITIONS); - return 0; - } - - iounmap((void *)mbx_map.virt); - return -ENXIO; -} - -static void __exit cleanup_mbx(void) -{ - if (mymtd) { - mtd_device_unregister(mymtd); - map_destroy(mymtd); - } - if (mbx_map.virt) { - iounmap((void *)mbx_map.virt); - mbx_map.virt = 0; - } -} - -module_init(init_mbx); -module_exit(cleanup_mbx); - -MODULE_AUTHOR("Anton Todorov "); -MODULE_DESCRIPTION("MTD map driver for Motorola MBX860 board"); -MODULE_LICENSE("GPL"); diff --git a/drivers/mtd/maps/pci.c b/drivers/mtd/maps/pci.c index c3aebd5da5d6..c2604f8b2a5e 100644 --- a/drivers/mtd/maps/pci.c +++ b/drivers/mtd/maps/pci.c @@ -283,8 +283,7 @@ static int mtd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) if (err) goto release; - /* tsk - do_map_probe should take const char * */ - mtd = do_map_probe((char *)info->map_name, &map->map); + mtd = do_map_probe(info->map_name, &map->map); err = -ENODEV; if (!mtd) goto release; diff --git a/drivers/mtd/maps/physmap.c b/drivers/mtd/maps/physmap.c index 21b0b713cacb..e7a592c8c765 100644 --- a/drivers/mtd/maps/physmap.c +++ b/drivers/mtd/maps/physmap.c @@ -87,21 +87,18 @@ static void physmap_set_vpp(struct map_info *map, int state) spin_unlock_irqrestore(&info->vpp_lock, flags); } -static const char *rom_probe_types[] = { - "cfi_probe", - "jedec_probe", - "qinfo_probe", - "map_rom", - NULL }; -static const char *part_probe_types[] = { "cmdlinepart", "RedBoot", "afs", - NULL }; +static const char * const rom_probe_types[] = { + "cfi_probe", "jedec_probe", "qinfo_probe", "map_rom", NULL }; + +static const char * const part_probe_types[] = { + "cmdlinepart", "RedBoot", "afs", NULL }; static int physmap_flash_probe(struct platform_device *dev) { struct physmap_flash_data *physmap_data; struct physmap_flash_info *info; - const char **probe_type; - const char **part_types; + const char * const *probe_type; + const char * const *part_types; int err = 0; int i; int devices_found = 0; diff --git a/drivers/mtd/maps/physmap_of.c b/drivers/mtd/maps/physmap_of.c index 363939dfad05..d11109762ac5 100644 --- a/drivers/mtd/maps/physmap_of.c +++ b/drivers/mtd/maps/physmap_of.c @@ -71,6 +71,9 @@ static int of_flash_remove(struct platform_device *dev) return 0; } +static const char * const rom_probe_types[] = { + "cfi_probe", "jedec_probe", "map_rom" }; + /* Helper function to handle probing of the obsolete "direct-mapped" * compatible binding, which has an extra "probe-type" property * describing the type of flash probe necessary. */ @@ -80,8 +83,6 @@ static struct mtd_info *obsolete_probe(struct platform_device *dev, struct device_node *dp = dev->dev.of_node; const char *of_probe; struct mtd_info *mtd; - static const char *rom_probe_types[] - = { "cfi_probe", "jedec_probe", "map_rom"}; int i; dev_warn(&dev->dev, "Device tree uses obsolete \"direct-mapped\" " @@ -111,9 +112,10 @@ static struct mtd_info *obsolete_probe(struct platform_device *dev, specifies the list of partition probers to use. If none is given then the default is use. These take precedence over other device tree information. */ -static const char *part_probe_types_def[] = { "cmdlinepart", "RedBoot", - "ofpart", "ofoldpart", NULL }; -static const char **of_get_probes(struct device_node *dp) +static const char * const part_probe_types_def[] = { + "cmdlinepart", "RedBoot", "ofpart", "ofoldpart", NULL }; + +static const char * const *of_get_probes(struct device_node *dp) { const char *cp; int cplen; @@ -142,7 +144,7 @@ static const char **of_get_probes(struct device_node *dp) return res; } -static void of_free_probes(const char **probes) +static void of_free_probes(const char * const *probes) { if (probes != part_probe_types_def) kfree(probes); @@ -151,7 +153,7 @@ static void of_free_probes(const char **probes) static struct of_device_id of_flash_match[]; static int of_flash_probe(struct platform_device *dev) { - const char **part_probe_types; + const char * const *part_probe_types; const struct of_device_id *match; struct device_node *dp = dev->dev.of_node; struct resource res; diff --git a/drivers/mtd/maps/plat-ram.c b/drivers/mtd/maps/plat-ram.c index 2de66b062f0d..71fdda29594b 100644 --- a/drivers/mtd/maps/plat-ram.c +++ b/drivers/mtd/maps/plat-ram.c @@ -199,7 +199,7 @@ static int platram_probe(struct platform_device *pdev) * supplied by the platform_data struct */ if (pdata->map_probes) { - const char **map_probes = pdata->map_probes; + const char * const *map_probes = pdata->map_probes; for ( ; !info->mtd && *map_probes; map_probes++) info->mtd = do_map_probe(*map_probes , &info->map); diff --git a/drivers/mtd/maps/pxa2xx-flash.c b/drivers/mtd/maps/pxa2xx-flash.c index 43e3dbb976d9..acb1dbcf7ce5 100644 --- a/drivers/mtd/maps/pxa2xx-flash.c +++ b/drivers/mtd/maps/pxa2xx-flash.c @@ -45,9 +45,7 @@ struct pxa2xx_flash_info { struct map_info map; }; - -static const char *probes[] = { "RedBoot", "cmdlinepart", NULL }; - +static const char * const probes[] = { "RedBoot", "cmdlinepart", NULL }; static int pxa2xx_flash_probe(struct platform_device *pdev) { diff --git a/drivers/mtd/maps/rbtx4939-flash.c b/drivers/mtd/maps/rbtx4939-flash.c index 49c3fe715eee..ac02fbffd6df 100644 --- a/drivers/mtd/maps/rbtx4939-flash.c +++ b/drivers/mtd/maps/rbtx4939-flash.c @@ -45,14 +45,15 @@ static int rbtx4939_flash_remove(struct platform_device *dev) return 0; } -static const char *rom_probe_types[] = { "cfi_probe", "jedec_probe", NULL }; +static const char * const rom_probe_types[] = { + "cfi_probe", "jedec_probe", NULL }; static int rbtx4939_flash_probe(struct platform_device *dev) { struct rbtx4939_flash_data *pdata; struct rbtx4939_flash_info *info; struct resource *res; - const char **probe_type; + const char * const *probe_type; int err = 0; unsigned long size; diff --git a/drivers/mtd/maps/rpxlite.c b/drivers/mtd/maps/rpxlite.c deleted file mode 100644 index ed88225bf667..000000000000 --- a/drivers/mtd/maps/rpxlite.c +++ /dev/null @@ -1,64 +0,0 @@ -/* - * Handle mapping of the flash on the RPX Lite and CLLF boards - */ - -#include -#include -#include -#include -#include -#include -#include - - -#define WINDOW_ADDR 0xfe000000 -#define WINDOW_SIZE 0x800000 - -static struct mtd_info *mymtd; - -static struct map_info rpxlite_map = { - .name = "RPX", - .size = WINDOW_SIZE, - .bankwidth = 4, - .phys = WINDOW_ADDR, -}; - -static int __init init_rpxlite(void) -{ - printk(KERN_NOTICE "RPX Lite or CLLF flash device: %x at %x\n", WINDOW_SIZE*4, WINDOW_ADDR); - rpxlite_map.virt = ioremap(WINDOW_ADDR, WINDOW_SIZE * 4); - - if (!rpxlite_map.virt) { - printk("Failed to ioremap\n"); - return -EIO; - } - simple_map_init(&rpxlite_map); - mymtd = do_map_probe("cfi_probe", &rpxlite_map); - if (mymtd) { - mymtd->owner = THIS_MODULE; - mtd_device_register(mymtd, NULL, 0); - return 0; - } - - iounmap((void *)rpxlite_map.virt); - return -ENXIO; -} - -static void __exit cleanup_rpxlite(void) -{ - if (mymtd) { - mtd_device_unregister(mymtd); - map_destroy(mymtd); - } - if (rpxlite_map.virt) { - iounmap((void *)rpxlite_map.virt); - rpxlite_map.virt = 0; - } -} - -module_init(init_rpxlite); -module_exit(cleanup_rpxlite); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Arnold Christensen "); -MODULE_DESCRIPTION("MTD map driver for RPX Lite and CLLF boards"); diff --git a/drivers/mtd/maps/sa1100-flash.c b/drivers/mtd/maps/sa1100-flash.c index f694417cf7e6..29e3dcaa1d90 100644 --- a/drivers/mtd/maps/sa1100-flash.c +++ b/drivers/mtd/maps/sa1100-flash.c @@ -244,7 +244,7 @@ static struct sa_info *sa1100_setup_mtd(struct platform_device *pdev, return ERR_PTR(ret); } -static const char *part_probes[] = { "cmdlinepart", "RedBoot", NULL }; +static const char * const part_probes[] = { "cmdlinepart", "RedBoot", NULL }; static int sa1100_mtd_probe(struct platform_device *pdev) { diff --git a/drivers/mtd/maps/solutionengine.c b/drivers/mtd/maps/solutionengine.c index 9d900ada6708..83a7a7091562 100644 --- a/drivers/mtd/maps/solutionengine.c +++ b/drivers/mtd/maps/solutionengine.c @@ -31,7 +31,7 @@ struct map_info soleng_flash_map = { .bankwidth = 4, }; -static const char *probes[] = { "RedBoot", "cmdlinepart", NULL }; +static const char * const probes[] = { "RedBoot", "cmdlinepart", NULL }; #ifdef CONFIG_MTD_SUPERH_RESERVE static struct mtd_partition superh_se_partitions[] = { diff --git a/drivers/mtd/maps/tqm8xxl.c b/drivers/mtd/maps/tqm8xxl.c deleted file mode 100644 index d78587990e7e..000000000000 --- a/drivers/mtd/maps/tqm8xxl.c +++ /dev/null @@ -1,249 +0,0 @@ -/* - * Handle mapping of the flash memory access routines - * on TQM8xxL based devices. - * - * based on rpxlite.c - * - * Copyright(C) 2001 Kirk Lee - * - * This code is GPLed - * - */ - -/* - * According to TQM8xxL hardware manual, TQM8xxL series have - * following flash memory organisations: - * | capacity | | chip type | | bank0 | | bank1 | - * 2MiB 512Kx16 2MiB 0 - * 4MiB 1Mx16 4MiB 0 - * 8MiB 1Mx16 4MiB 4MiB - * Thus, we choose CONFIG_MTD_CFI_I2 & CONFIG_MTD_CFI_B4 at - * kernel configuration. - */ -#include -#include -#include -#include -#include - -#include -#include -#include - -#include - -#define FLASH_ADDR 0x40000000 -#define FLASH_SIZE 0x00800000 -#define FLASH_BANK_MAX 4 - -// trivial struct to describe partition information -struct mtd_part_def -{ - int nums; - unsigned char *type; - struct mtd_partition* mtd_part; -}; - -//static struct mtd_info *mymtd; -static struct mtd_info* mtd_banks[FLASH_BANK_MAX]; -static struct map_info* map_banks[FLASH_BANK_MAX]; -static struct mtd_part_def part_banks[FLASH_BANK_MAX]; -static unsigned long num_banks; -static void __iomem *start_scan_addr; - -/* - * Here are partition information for all known TQM8xxL series devices. - * See include/linux/mtd/partitions.h for definition of the mtd_partition - * structure. - * - * The *_max_flash_size is the maximum possible mapped flash size which - * is not necessarily the actual flash size. It must correspond to the - * value specified in the mapping definition defined by the - * "struct map_desc *_io_desc" for the corresponding machine. - */ - -/* Currently, TQM8xxL has up to 8MiB flash */ -static unsigned long tqm8xxl_max_flash_size = 0x00800000; - -/* partition definition for first flash bank - * (cf. "drivers/char/flash_config.c") - */ -static struct mtd_partition tqm8xxl_partitions[] = { - { - .name = "ppcboot", - .offset = 0x00000000, - .size = 0x00020000, /* 128KB */ - .mask_flags = MTD_WRITEABLE, /* force read-only */ - }, - { - .name = "kernel", /* default kernel image */ - .offset = 0x00020000, - .size = 0x000e0000, - .mask_flags = MTD_WRITEABLE, /* force read-only */ - }, - { - .name = "user", - .offset = 0x00100000, - .size = 0x00100000, - }, - { - .name = "initrd", - .offset = 0x00200000, - .size = 0x00200000, - } -}; -/* partition definition for second flash bank */ -static struct mtd_partition tqm8xxl_fs_partitions[] = { - { - .name = "cramfs", - .offset = 0x00000000, - .size = 0x00200000, - }, - { - .name = "jffs", - .offset = 0x00200000, - .size = 0x00200000, - //.size = MTDPART_SIZ_FULL, - } -}; - -static int __init init_tqm_mtd(void) -{ - int idx = 0, ret = 0; - unsigned long flash_addr, flash_size, mtd_size = 0; - /* pointer to TQM8xxL board info data */ - bd_t *bd = (bd_t *)__res; - - flash_addr = bd->bi_flashstart; - flash_size = bd->bi_flashsize; - - //request maximum flash size address space - start_scan_addr = ioremap(flash_addr, flash_size); - if (!start_scan_addr) { - printk(KERN_WARNING "%s:Failed to ioremap address:0x%x\n", __func__, flash_addr); - return -EIO; - } - - for (idx = 0 ; idx < FLASH_BANK_MAX ; idx++) { - if(mtd_size >= flash_size) - break; - - printk(KERN_INFO "%s: chip probing count %d\n", __func__, idx); - - map_banks[idx] = kzalloc(sizeof(struct map_info), GFP_KERNEL); - if(map_banks[idx] == NULL) { - ret = -ENOMEM; - /* FIXME: What if some MTD devices were probed already? */ - goto error_mem; - } - - map_banks[idx]->name = kmalloc(16, GFP_KERNEL); - - if (!map_banks[idx]->name) { - ret = -ENOMEM; - /* FIXME: What if some MTD devices were probed already? */ - goto error_mem; - } - sprintf(map_banks[idx]->name, "TQM8xxL%d", idx); - - map_banks[idx]->size = flash_size; - map_banks[idx]->bankwidth = 4; - - simple_map_init(map_banks[idx]); - - map_banks[idx]->virt = start_scan_addr; - map_banks[idx]->phys = flash_addr; - /* FIXME: This looks utterly bogus, but I'm trying to - preserve the behaviour of the original (shown here)... - - map_banks[idx]->map_priv_1 = - start_scan_addr + ((idx > 0) ? - (mtd_banks[idx-1] ? mtd_banks[idx-1]->size : 0) : 0); - */ - - if (idx && mtd_banks[idx-1]) { - map_banks[idx]->virt += mtd_banks[idx-1]->size; - map_banks[idx]->phys += mtd_banks[idx-1]->size; - } - - //start to probe flash chips - mtd_banks[idx] = do_map_probe("cfi_probe", map_banks[idx]); - - if (mtd_banks[idx]) { - mtd_banks[idx]->owner = THIS_MODULE; - mtd_size += mtd_banks[idx]->size; - num_banks++; - - printk(KERN_INFO "%s: bank%d, name:%s, size:%dbytes \n", __func__, num_banks, - mtd_banks[idx]->name, mtd_banks[idx]->size); - } - } - - /* no supported flash chips found */ - if (!num_banks) { - printk(KERN_NOTICE "TQM8xxL: No support flash chips found!\n"); - ret = -ENXIO; - goto error_mem; - } - - /* - * Select Static partition definitions - */ - part_banks[0].mtd_part = tqm8xxl_partitions; - part_banks[0].type = "Static image"; - part_banks[0].nums = ARRAY_SIZE(tqm8xxl_partitions); - - part_banks[1].mtd_part = tqm8xxl_fs_partitions; - part_banks[1].type = "Static file system"; - part_banks[1].nums = ARRAY_SIZE(tqm8xxl_fs_partitions); - - for(idx = 0; idx < num_banks ; idx++) { - if (part_banks[idx].nums == 0) - printk(KERN_NOTICE "TQM flash%d: no partition info available, registering whole flash at once\n", idx); - else - printk(KERN_NOTICE "TQM flash%d: Using %s partition definition\n", - idx, part_banks[idx].type); - mtd_device_register(mtd_banks[idx], part_banks[idx].mtd_part, - part_banks[idx].nums); - } - return 0; -error_mem: - for(idx = 0 ; idx < FLASH_BANK_MAX ; idx++) { - if(map_banks[idx] != NULL) { - kfree(map_banks[idx]->name); - map_banks[idx]->name = NULL; - kfree(map_banks[idx]); - map_banks[idx] = NULL; - } - } -error: - iounmap(start_scan_addr); - return ret; -} - -static void __exit cleanup_tqm_mtd(void) -{ - unsigned int idx = 0; - for(idx = 0 ; idx < num_banks ; idx++) { - /* destroy mtd_info previously allocated */ - if (mtd_banks[idx]) { - mtd_device_unregister(mtd_banks[idx]); - map_destroy(mtd_banks[idx]); - } - /* release map_info not used anymore */ - kfree(map_banks[idx]->name); - kfree(map_banks[idx]); - } - - if (start_scan_addr) { - iounmap(start_scan_addr); - start_scan_addr = 0; - } -} - -module_init(init_tqm_mtd); -module_exit(cleanup_tqm_mtd); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Kirk Lee "); -MODULE_DESCRIPTION("MTD map driver for TQM8xxL boards"); diff --git a/drivers/mtd/maps/tsunami_flash.c b/drivers/mtd/maps/tsunami_flash.c index 1de390e1c2fb..da2cdb5fd6db 100644 --- a/drivers/mtd/maps/tsunami_flash.c +++ b/drivers/mtd/maps/tsunami_flash.c @@ -82,11 +82,12 @@ static void __exit cleanup_tsunami_flash(void) tsunami_flash_mtd = 0; } +static const char * const rom_probe_types[] = { + "cfi_probe", "jedec_probe", "map_rom", NULL }; static int __init init_tsunami_flash(void) { - static const char *rom_probe_types[] = { "cfi_probe", "jedec_probe", "map_rom", NULL }; - char **type; + const char * const *type; tsunami_tig_writeb(FLASH_ENABLE_BYTE, FLASH_ENABLE_PORT); diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c index dc571ebc1aa0..c719879284bd 100644 --- a/drivers/mtd/mtdchar.c +++ b/drivers/mtd/mtdchar.c @@ -38,6 +38,8 @@ #include +#include "mtdcore.h" + static DEFINE_MUTEX(mtd_mutex); /* @@ -365,37 +367,35 @@ static void mtdchar_erase_callback (struct erase_info *instr) wake_up((wait_queue_head_t *)instr->priv); } -#ifdef CONFIG_HAVE_MTD_OTP static int otp_select_filemode(struct mtd_file_info *mfi, int mode) { struct mtd_info *mtd = mfi->mtd; size_t retlen; - int ret = 0; - - /* - * Make a fake call to mtd_read_fact_prot_reg() to check if OTP - * operations are supported. - */ - if (mtd_read_fact_prot_reg(mtd, -1, 0, &retlen, NULL) == -EOPNOTSUPP) - return -EOPNOTSUPP; switch (mode) { case MTD_OTP_FACTORY: + if (mtd_read_fact_prot_reg(mtd, -1, 0, &retlen, NULL) == + -EOPNOTSUPP) + return -EOPNOTSUPP; + mfi->mode = MTD_FILE_MODE_OTP_FACTORY; break; case MTD_OTP_USER: + if (mtd_read_user_prot_reg(mtd, -1, 0, &retlen, NULL) == + -EOPNOTSUPP) + return -EOPNOTSUPP; + mfi->mode = MTD_FILE_MODE_OTP_USER; break; - default: - ret = -EINVAL; case MTD_OTP_OFF: + mfi->mode = MTD_FILE_MODE_NORMAL; break; + default: + return -EINVAL; } - return ret; + + return 0; } -#else -# define otp_select_filemode(f,m) -EOPNOTSUPP -#endif static int mtdchar_writeoob(struct file *file, struct mtd_info *mtd, uint64_t start, uint32_t length, void __user *ptr, @@ -888,7 +888,6 @@ static int mtdchar_ioctl(struct file *file, u_int cmd, u_long arg) break; } -#ifdef CONFIG_HAVE_MTD_OTP case OTPSELECT: { int mode; @@ -944,7 +943,6 @@ static int mtdchar_ioctl(struct file *file, u_int cmd, u_long arg) ret = mtd_lock_user_prot_reg(mtd, oinfo.start, oinfo.length); break; } -#endif /* This ioctl is being deprecated - it truncates the ECC layout */ case ECCGETLAYOUT: @@ -1185,23 +1183,25 @@ static struct file_system_type mtd_inodefs_type = { }; MODULE_ALIAS_FS("mtd_inodefs"); -static int __init init_mtdchar(void) +int __init init_mtdchar(void) { int ret; ret = __register_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS, "mtd", &mtd_fops); if (ret < 0) { - pr_notice("Can't allocate major number %d for " - "Memory Technology Devices.\n", MTD_CHAR_MAJOR); + pr_err("Can't allocate major number %d for MTD\n", + MTD_CHAR_MAJOR); return ret; } ret = register_filesystem(&mtd_inodefs_type); if (ret) { - pr_notice("Can't register mtd_inodefs filesystem: %d\n", ret); + pr_err("Can't register mtd_inodefs filesystem, error %d\n", + ret); goto err_unregister_chdev; } + return ret; err_unregister_chdev: @@ -1209,18 +1209,10 @@ err_unregister_chdev: return ret; } -static void __exit cleanup_mtdchar(void) +void __exit cleanup_mtdchar(void) { unregister_filesystem(&mtd_inodefs_type); __unregister_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS, "mtd"); } -module_init(init_mtdchar); -module_exit(cleanup_mtdchar); - -MODULE_ALIAS_CHARDEV_MAJOR(MTD_CHAR_MAJOR); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("David Woodhouse "); -MODULE_DESCRIPTION("Direct character-device access to MTD devices"); MODULE_ALIAS_CHARDEV_MAJOR(MTD_CHAR_MAJOR); diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c index 322ca65b0cc5..c400c57c394a 100644 --- a/drivers/mtd/mtdcore.c +++ b/drivers/mtd/mtdcore.c @@ -42,6 +42,7 @@ #include #include "mtdcore.h" + /* * backing device capabilities for non-mappable devices (such as NAND flash) * - permits private mappings, copies are taken of the data @@ -97,11 +98,7 @@ EXPORT_SYMBOL_GPL(__mtd_next_device); static LIST_HEAD(mtd_notifiers); -#if defined(CONFIG_MTD_CHAR) || defined(CONFIG_MTD_CHAR_MODULE) #define MTD_DEVT(index) MKDEV(MTD_CHAR_MAJOR, (index)*2) -#else -#define MTD_DEVT(index) 0 -#endif /* REVISIT once MTD uses the driver model better, whoever allocates * the mtd_info will probably want to use the release() hook... @@ -493,7 +490,7 @@ out_error: * * Returns zero in case of success and a negative error code in case of failure. */ -int mtd_device_parse_register(struct mtd_info *mtd, const char **types, +int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types, struct mtd_part_parser_data *parser_data, const struct mtd_partition *parts, int nr_parts) @@ -1117,8 +1114,6 @@ EXPORT_SYMBOL_GPL(mtd_kmalloc_up_to); /*====================================================================*/ /* Support for /proc/mtd */ -static struct proc_dir_entry *proc_mtd; - static int mtd_proc_show(struct seq_file *m, void *v) { struct mtd_info *mtd; @@ -1164,6 +1159,8 @@ static int __init mtd_bdi_init(struct backing_dev_info *bdi, const char *name) return ret; } +static struct proc_dir_entry *proc_mtd; + static int __init init_mtd(void) { int ret; @@ -1184,11 +1181,17 @@ static int __init init_mtd(void) if (ret) goto err_bdi3; -#ifdef CONFIG_PROC_FS proc_mtd = proc_create("mtd", 0, NULL, &mtd_proc_ops); -#endif /* CONFIG_PROC_FS */ + + ret = init_mtdchar(); + if (ret) + goto out_procfs; + return 0; +out_procfs: + if (proc_mtd) + remove_proc_entry("mtd", NULL); err_bdi3: bdi_destroy(&mtd_bdi_ro_mappable); err_bdi2: @@ -1202,10 +1205,9 @@ err_reg: static void __exit cleanup_mtd(void) { -#ifdef CONFIG_PROC_FS + cleanup_mtdchar(); if (proc_mtd) - remove_proc_entry( "mtd", NULL); -#endif /* CONFIG_PROC_FS */ + remove_proc_entry("mtd", NULL); class_unregister(&mtd_class); bdi_destroy(&mtd_bdi_unmappable); bdi_destroy(&mtd_bdi_ro_mappable); diff --git a/drivers/mtd/mtdcore.h b/drivers/mtd/mtdcore.h index 961a38408542..7b0353399a10 100644 --- a/drivers/mtd/mtdcore.h +++ b/drivers/mtd/mtdcore.h @@ -1,23 +1,21 @@ -/* linux/drivers/mtd/mtdcore.h - * - * Header file for driver private mtdcore exports - * +/* + * These are exported solely for the purpose of mtd_blkdevs.c and mtdchar.c. + * You should not use them for _anything_ else. */ -/* These are exported solely for the purpose of mtd_blkdevs.c. You - should not use them for _anything_ else */ - extern struct mutex mtd_table_mutex; -extern struct mtd_info *__mtd_next_device(int i); -extern int add_mtd_device(struct mtd_info *mtd); -extern int del_mtd_device(struct mtd_info *mtd); -extern int add_mtd_partitions(struct mtd_info *, const struct mtd_partition *, - int); -extern int del_mtd_partitions(struct mtd_info *); -extern int parse_mtd_partitions(struct mtd_info *master, const char **types, - struct mtd_partition **pparts, - struct mtd_part_parser_data *data); +struct mtd_info *__mtd_next_device(int i); +int add_mtd_device(struct mtd_info *mtd); +int del_mtd_device(struct mtd_info *mtd); +int add_mtd_partitions(struct mtd_info *, const struct mtd_partition *, int); +int del_mtd_partitions(struct mtd_info *); +int parse_mtd_partitions(struct mtd_info *master, const char * const *types, + struct mtd_partition **pparts, + struct mtd_part_parser_data *data); + +int __init init_mtdchar(void); +void __exit cleanup_mtdchar(void); #define mtd_for_each_device(mtd) \ for ((mtd) = __mtd_next_device(0); \ diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c index 70fa70a8318f..301493382cd0 100644 --- a/drivers/mtd/mtdpart.c +++ b/drivers/mtd/mtdpart.c @@ -694,7 +694,7 @@ EXPORT_SYMBOL_GPL(deregister_mtd_parser); * Do not forget to update 'parse_mtd_partitions()' kerneldoc comment if you * are changing this array! */ -static const char *default_mtd_part_types[] = { +static const char * const default_mtd_part_types[] = { "cmdlinepart", "ofpart", NULL @@ -720,7 +720,7 @@ static const char *default_mtd_part_types[] = { * o a positive number of found partitions, in which case on exit @pparts will * point to an array containing this number of &struct mtd_info objects. */ -int parse_mtd_partitions(struct mtd_info *master, const char **types, +int parse_mtd_partitions(struct mtd_info *master, const char *const *types, struct mtd_partition **pparts, struct mtd_part_parser_data *data) { diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig index 81bf5e52601e..a60f6c17f57b 100644 --- a/drivers/mtd/nand/Kconfig +++ b/drivers/mtd/nand/Kconfig @@ -41,14 +41,6 @@ config MTD_SM_COMMON tristate default n -config MTD_NAND_MUSEUM_IDS - bool "Enable chip ids for obsolete ancient NAND devices" - default n - help - Enable this option only when your board has first generation - NAND chips (page size 256 byte, erase size 4-8KiB). The IDs - of these chips were reused by later, larger chips. - config MTD_NAND_DENALI tristate "Support Denali NAND controller" help @@ -81,15 +73,9 @@ config MTD_NAND_DENALI_SCRATCH_REG_ADDR scratch register here to enable this feature. On Intel Moorestown boards, the scratch register is at 0xFF108018. -config MTD_NAND_H1900 - tristate "iPAQ H1900 flash" - depends on ARCH_PXA && BROKEN - help - This enables the driver for the iPAQ h1900 flash. - config MTD_NAND_GPIO tristate "GPIO NAND Flash driver" - depends on GENERIC_GPIO && ARM + depends on GPIOLIB && ARM help This enables a GPIO based NAND flash driver. @@ -201,22 +187,6 @@ config MTD_NAND_BF5XX_BOOTROM_ECC If unsure, say N. -config MTD_NAND_RTC_FROM4 - tristate "Renesas Flash ROM 4-slot interface board (FROM_BOARD4)" - depends on SH_SOLUTION_ENGINE - select REED_SOLOMON - select REED_SOLOMON_DEC8 - select BITREVERSE - help - This enables the driver for the Renesas Technology AG-AND - flash interface board (FROM_BOARD4) - -config MTD_NAND_PPCHAMELEONEVB - tristate "NAND Flash device on PPChameleonEVB board" - depends on PPCHAMELEONEVB && BROKEN - help - This enables the NAND flash driver on the PPChameleon EVB Board. - config MTD_NAND_S3C2410 tristate "NAND Flash support for Samsung S3C SoCs" depends on ARCH_S3C24XX || ARCH_S3C64XX diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile index d76d91205691..bb8189172f62 100644 --- a/drivers/mtd/nand/Makefile +++ b/drivers/mtd/nand/Makefile @@ -15,14 +15,11 @@ obj-$(CONFIG_MTD_NAND_DENALI_PCI) += denali_pci.o obj-$(CONFIG_MTD_NAND_DENALI_DT) += denali_dt.o obj-$(CONFIG_MTD_NAND_AU1550) += au1550nd.o obj-$(CONFIG_MTD_NAND_BF5XX) += bf5xx_nand.o -obj-$(CONFIG_MTD_NAND_PPCHAMELEONEVB) += ppchameleonevb.o obj-$(CONFIG_MTD_NAND_S3C2410) += s3c2410.o obj-$(CONFIG_MTD_NAND_DAVINCI) += davinci_nand.o obj-$(CONFIG_MTD_NAND_DISKONCHIP) += diskonchip.o obj-$(CONFIG_MTD_NAND_DOCG4) += docg4.o obj-$(CONFIG_MTD_NAND_FSMC) += fsmc_nand.o -obj-$(CONFIG_MTD_NAND_H1900) += h1910.o -obj-$(CONFIG_MTD_NAND_RTC_FROM4) += rtc_from4.o obj-$(CONFIG_MTD_NAND_SHARPSL) += sharpsl.o obj-$(CONFIG_MTD_NAND_NANDSIM) += nandsim.o obj-$(CONFIG_MTD_NAND_CS553X) += cs553x_nand.o diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c index ffcbcca2fd2d..2d23d2929438 100644 --- a/drivers/mtd/nand/atmel_nand.c +++ b/drivers/mtd/nand/atmel_nand.c @@ -1737,20 +1737,7 @@ static struct platform_driver atmel_nand_driver = { }, }; -static int __init atmel_nand_init(void) -{ - return platform_driver_probe(&atmel_nand_driver, atmel_nand_probe); -} - - -static void __exit atmel_nand_exit(void) -{ - platform_driver_unregister(&atmel_nand_driver); -} - - -module_init(atmel_nand_init); -module_exit(atmel_nand_exit); +module_platform_driver_probe(atmel_nand_driver, atmel_nand_probe); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Rick Bronson"); diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c index 4271e948d1e2..776df3694f75 100644 --- a/drivers/mtd/nand/bf5xx_nand.c +++ b/drivers/mtd/nand/bf5xx_nand.c @@ -874,21 +874,7 @@ static struct platform_driver bf5xx_nand_driver = { }, }; -static int __init bf5xx_nand_init(void) -{ - printk(KERN_INFO "%s, Version %s (c) 2007 Analog Devices, Inc.\n", - DRV_DESC, DRV_VERSION); - - return platform_driver_register(&bf5xx_nand_driver); -} - -static void __exit bf5xx_nand_exit(void) -{ - platform_driver_unregister(&bf5xx_nand_driver); -} - -module_init(bf5xx_nand_init); -module_exit(bf5xx_nand_exit); +module_platform_driver(bf5xx_nand_driver); MODULE_LICENSE("GPL"); MODULE_AUTHOR(DRV_AUTHOR); diff --git a/drivers/mtd/nand/cafe_nand.c b/drivers/mtd/nand/cafe_nand.c index 010d61266536..c34985a55101 100644 --- a/drivers/mtd/nand/cafe_nand.c +++ b/drivers/mtd/nand/cafe_nand.c @@ -303,13 +303,7 @@ static void cafe_nand_cmdfunc(struct mtd_info *mtd, unsigned command, case NAND_CMD_SEQIN: case NAND_CMD_RNDIN: case NAND_CMD_STATUS: - case NAND_CMD_DEPLETE1: case NAND_CMD_RNDOUT: - case NAND_CMD_STATUS_ERROR: - case NAND_CMD_STATUS_ERROR0: - case NAND_CMD_STATUS_ERROR1: - case NAND_CMD_STATUS_ERROR2: - case NAND_CMD_STATUS_ERROR3: cafe_writel(cafe, cafe->ctl2, NAND_CTRL2); return; } @@ -536,8 +530,8 @@ static int cafe_nand_write_page_lowlevel(struct mtd_info *mtd, } static int cafe_nand_write_page(struct mtd_info *mtd, struct nand_chip *chip, - const uint8_t *buf, int oob_required, int page, - int cached, int raw) + uint32_t offset, int data_len, const uint8_t *buf, + int oob_required, int page, int cached, int raw) { int status; diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c index 94e17af8e450..c3e15a558173 100644 --- a/drivers/mtd/nand/davinci_nand.c +++ b/drivers/mtd/nand/davinci_nand.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -577,7 +578,6 @@ static struct davinci_nand_pdata return pdev->dev.platform_data; } #else -#define davinci_nand_of_match NULL static struct davinci_nand_pdata *nand_davinci_get_pdata(struct platform_device *pdev) { @@ -878,22 +878,12 @@ static struct platform_driver nand_davinci_driver = { .driver = { .name = "davinci_nand", .owner = THIS_MODULE, - .of_match_table = davinci_nand_of_match, + .of_match_table = of_match_ptr(davinci_nand_of_match), }, }; MODULE_ALIAS("platform:davinci_nand"); -static int __init nand_davinci_init(void) -{ - return platform_driver_probe(&nand_davinci_driver, nand_davinci_probe); -} -module_init(nand_davinci_init); - -static void __exit nand_davinci_exit(void) -{ - platform_driver_unregister(&nand_davinci_driver); -} -module_exit(nand_davinci_exit); +module_platform_driver_probe(nand_davinci_driver, nand_davinci_probe); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Texas Instruments"); diff --git a/drivers/mtd/nand/denali_dt.c b/drivers/mtd/nand/denali_dt.c index 546f8cb5688d..92530244e2cb 100644 --- a/drivers/mtd/nand/denali_dt.c +++ b/drivers/mtd/nand/denali_dt.c @@ -42,7 +42,7 @@ static void __iomem *request_and_map(struct device *dev, } ptr = devm_ioremap_nocache(dev, res->start, resource_size(res)); - if (!res) + if (!ptr) dev_err(dev, "ioremap_nocache of %s failed!", res->name); return ptr; @@ -90,7 +90,7 @@ static int denali_dt_probe(struct platform_device *ofdev) denali->irq = platform_get_irq(ofdev, 0); if (denali->irq < 0) { dev_err(&ofdev->dev, "no irq defined\n"); - return -ENXIO; + return denali->irq; } denali->flash_reg = request_and_map(&ofdev->dev, denali_reg); @@ -146,21 +146,11 @@ static struct platform_driver denali_dt_driver = { .driver = { .name = "denali-nand-dt", .owner = THIS_MODULE, - .of_match_table = of_match_ptr(denali_nand_dt_ids), + .of_match_table = denali_nand_dt_ids, }, }; -static int __init denali_init_dt(void) -{ - return platform_driver_register(&denali_dt_driver); -} -module_init(denali_init_dt); - -static void __exit denali_exit_dt(void) -{ - platform_driver_unregister(&denali_dt_driver); -} -module_exit(denali_exit_dt); +module_platform_driver(denali_dt_driver); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Jamie Iles"); diff --git a/drivers/mtd/nand/docg4.c b/drivers/mtd/nand/docg4.c index 18fa4489e52e..fa25e7a08134 100644 --- a/drivers/mtd/nand/docg4.c +++ b/drivers/mtd/nand/docg4.c @@ -1397,18 +1397,7 @@ static struct platform_driver docg4_driver = { .remove = __exit_p(cleanup_docg4), }; -static int __init docg4_init(void) -{ - return platform_driver_probe(&docg4_driver, probe_docg4); -} - -static void __exit docg4_exit(void) -{ - platform_driver_unregister(&docg4_driver); -} - -module_init(docg4_init); -module_exit(docg4_exit); +module_platform_driver_probe(docg4_driver, probe_docg4); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Mike Dunn"); diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c index 05ba3f0c2d19..911e2433fe30 100644 --- a/drivers/mtd/nand/fsmc_nand.c +++ b/drivers/mtd/nand/fsmc_nand.c @@ -1235,18 +1235,7 @@ static struct platform_driver fsmc_nand_driver = { }, }; -static int __init fsmc_nand_init(void) -{ - return platform_driver_probe(&fsmc_nand_driver, - fsmc_nand_probe); -} -module_init(fsmc_nand_init); - -static void __exit fsmc_nand_exit(void) -{ - platform_driver_unregister(&fsmc_nand_driver); -} -module_exit(fsmc_nand_exit); +module_platform_driver_probe(fsmc_nand_driver, fsmc_nand_probe); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Vipin Kumar , Ashish Priyadarshi"); diff --git a/drivers/mtd/nand/gpio.c b/drivers/mtd/nand/gpio.c index e789e3f51710..89065dd83d64 100644 --- a/drivers/mtd/nand/gpio.c +++ b/drivers/mtd/nand/gpio.c @@ -190,7 +190,6 @@ static struct resource *gpio_nand_get_io_sync_of(struct platform_device *pdev) return r; } #else /* CONFIG_OF */ -#define gpio_nand_id_table NULL static inline int gpio_nand_get_config_of(const struct device *dev, struct gpio_nand_platdata *plat) { @@ -259,8 +258,6 @@ static int gpio_nand_remove(struct platform_device *dev) if (gpio_is_valid(gpiomtd->plat.gpio_rdy)) gpio_free(gpiomtd->plat.gpio_rdy); - kfree(gpiomtd); - return 0; } @@ -297,7 +294,7 @@ static int gpio_nand_probe(struct platform_device *dev) if (!res0) return -EINVAL; - gpiomtd = kzalloc(sizeof(*gpiomtd), GFP_KERNEL); + gpiomtd = devm_kzalloc(&dev->dev, sizeof(*gpiomtd), GFP_KERNEL); if (gpiomtd == NULL) { dev_err(&dev->dev, "failed to create NAND MTD\n"); return -ENOMEM; @@ -412,7 +409,6 @@ err_sync: iounmap(gpiomtd->nand_chip.IO_ADDR_R); release_mem_region(res0->start, resource_size(res0)); err_map: - kfree(gpiomtd); return ret; } @@ -421,7 +417,7 @@ static struct platform_driver gpio_nand_driver = { .remove = gpio_nand_remove, .driver = { .name = "gpio-nand", - .of_match_table = gpio_nand_id_table, + .of_match_table = of_match_ptr(gpio_nand_id_table), }, }; diff --git a/drivers/mtd/nand/h1910.c b/drivers/mtd/nand/h1910.c deleted file mode 100644 index 50166e93ba96..000000000000 --- a/drivers/mtd/nand/h1910.c +++ /dev/null @@ -1,167 +0,0 @@ -/* - * drivers/mtd/nand/h1910.c - * - * Copyright (C) 2003 Joshua Wise (joshua@joshuawise.com) - * - * Derived from drivers/mtd/nand/edb7312.c - * Copyright (C) 2002 Marius Gröger (mag@sysgo.de) - * Copyright (c) 2001 Thomas Gleixner (gleixner@autronix.de) - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * Overview: - * This is a device driver for the NAND flash device found on the - * iPAQ h1910 board which utilizes the Samsung K9F2808 part. This is - * a 128Mibit (16MiB x 8 bits) NAND flash device. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* - * MTD structure for EDB7312 board - */ -static struct mtd_info *h1910_nand_mtd = NULL; - -/* - * Module stuff - */ - -/* - * Define static partitions for flash device - */ -static struct mtd_partition partition_info[] = { - {name:"h1910 NAND Flash", - offset:0, - size:16 * 1024 * 1024} -}; - -#define NUM_PARTITIONS 1 - -/* - * hardware specific access to control-lines - * - * NAND_NCE: bit 0 - don't care - * NAND_CLE: bit 1 - address bit 2 - * NAND_ALE: bit 2 - address bit 3 - */ -static void h1910_hwcontrol(struct mtd_info *mtd, int cmd, - unsigned int ctrl) -{ - struct nand_chip *chip = mtd->priv; - - if (cmd != NAND_CMD_NONE) - writeb(cmd, chip->IO_ADDR_W | ((ctrl & 0x6) << 1)); -} - -/* - * read device ready pin - */ -#if 0 -static int h1910_device_ready(struct mtd_info *mtd) -{ - return (GPLR(55) & GPIO_bit(55)); -} -#endif - -/* - * Main initialization routine - */ -static int __init h1910_init(void) -{ - struct nand_chip *this; - void __iomem *nandaddr; - - if (!machine_is_h1900()) - return -ENODEV; - - nandaddr = ioremap(0x08000000, 0x1000); - if (!nandaddr) { - printk("Failed to ioremap nand flash.\n"); - return -ENOMEM; - } - - /* Allocate memory for MTD device structure and private data */ - h1910_nand_mtd = kmalloc(sizeof(struct mtd_info) + sizeof(struct nand_chip), GFP_KERNEL); - if (!h1910_nand_mtd) { - printk("Unable to allocate h1910 NAND MTD device structure.\n"); - iounmap((void *)nandaddr); - return -ENOMEM; - } - - /* Get pointer to private data */ - this = (struct nand_chip *)(&h1910_nand_mtd[1]); - - /* Initialize structures */ - memset(h1910_nand_mtd, 0, sizeof(struct mtd_info)); - memset(this, 0, sizeof(struct nand_chip)); - - /* Link the private data with the MTD structure */ - h1910_nand_mtd->priv = this; - h1910_nand_mtd->owner = THIS_MODULE; - - /* - * Enable VPEN - */ - GPSR(37) = GPIO_bit(37); - - /* insert callbacks */ - this->IO_ADDR_R = nandaddr; - this->IO_ADDR_W = nandaddr; - this->cmd_ctrl = h1910_hwcontrol; - this->dev_ready = NULL; /* unknown whether that was correct or not so we will just do it like this */ - /* 15 us command delay time */ - this->chip_delay = 50; - this->ecc.mode = NAND_ECC_SOFT; - - /* Scan to find existence of the device */ - if (nand_scan(h1910_nand_mtd, 1)) { - printk(KERN_NOTICE "No NAND device - returning -ENXIO\n"); - kfree(h1910_nand_mtd); - iounmap((void *)nandaddr); - return -ENXIO; - } - - /* Register the partitions */ - mtd_device_parse_register(h1910_nand_mtd, NULL, NULL, partition_info, - NUM_PARTITIONS); - - /* Return happy */ - return 0; -} - -module_init(h1910_init); - -/* - * Clean up routine - */ -static void __exit h1910_cleanup(void) -{ - struct nand_chip *this = (struct nand_chip *)&h1910_nand_mtd[1]; - - /* Release resources, unregister device */ - nand_release(h1910_nand_mtd); - - /* Release io resource */ - iounmap((void *)this->IO_ADDR_W); - - /* Free the MTD device structure */ - kfree(h1910_nand_mtd); -} - -module_exit(h1910_cleanup); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Joshua Wise "); -MODULE_DESCRIPTION("NAND flash driver for iPAQ h1910"); diff --git a/drivers/mtd/nand/lpc32xx_mlc.c b/drivers/mtd/nand/lpc32xx_mlc.c index 0ca22ae9135c..a94facb46e5c 100644 --- a/drivers/mtd/nand/lpc32xx_mlc.c +++ b/drivers/mtd/nand/lpc32xx_mlc.c @@ -540,8 +540,8 @@ static int lpc32xx_write_page_lowlevel(struct mtd_info *mtd, } static int lpc32xx_write_page(struct mtd_info *mtd, struct nand_chip *chip, - const uint8_t *buf, int oob_required, int page, - int cached, int raw) + uint32_t offset, int data_len, const uint8_t *buf, + int oob_required, int page, int cached, int raw) { int res; diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index 42c63927609d..dfcd0a565c5b 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -4,7 +4,6 @@ * Overview: * This is the generic MTD driver for NAND flash devices. It should be * capable of working with almost all NAND chips currently available. - * Basic support for AG-AND chips is provided. * * Additional technical information is available on * http://www.linux-mtd.infradead.org/doc/nand.html @@ -22,8 +21,6 @@ * Enable cached programming for 2k page size chips * Check, if mtd->ecctype should be set to MTD_ECC_HW * if we have HW ECC support. - * The AG-AND chips have nice features for speed improvement, - * which are not supported yet. Read / program 4 pages in one go. * BBT table is not serialized, has to be fixed * * This program is free software; you can redistribute it and/or modify @@ -515,7 +512,7 @@ EXPORT_SYMBOL_GPL(nand_wait_ready); * @page_addr: the page address for this command, -1 if none * * Send command to NAND device. This function is used for small page devices - * (256/512 Bytes per page). + * (512 Bytes per page). */ static void nand_command(struct mtd_info *mtd, unsigned int command, int column, int page_addr) @@ -631,8 +628,7 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command, } /* Command latch cycle */ - chip->cmd_ctrl(mtd, command & 0xff, - NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE); + chip->cmd_ctrl(mtd, command, NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE); if (column != -1 || page_addr != -1) { int ctrl = NAND_CTRL_CHANGE | NAND_NCE | NAND_ALE; @@ -671,16 +667,6 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command, case NAND_CMD_SEQIN: case NAND_CMD_RNDIN: case NAND_CMD_STATUS: - case NAND_CMD_DEPLETE1: - return; - - case NAND_CMD_STATUS_ERROR: - case NAND_CMD_STATUS_ERROR0: - case NAND_CMD_STATUS_ERROR1: - case NAND_CMD_STATUS_ERROR2: - case NAND_CMD_STATUS_ERROR3: - /* Read error status commands require only a short delay */ - udelay(chip->chip_delay); return; case NAND_CMD_RESET: @@ -836,10 +822,7 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip) */ ndelay(100); - if ((state == FL_ERASING) && (chip->options & NAND_IS_AND)) - chip->cmdfunc(mtd, NAND_CMD_STATUS_MULTI, -1, -1); - else - chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1); + chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1); if (in_interrupt() || oops_in_progress) panic_nand_wait(mtd, chip, timeo); @@ -1127,7 +1110,7 @@ static int nand_read_page_swecc(struct mtd_info *mtd, struct nand_chip *chip, } /** - * nand_read_subpage - [REPLACEABLE] software ECC based sub-page read function + * nand_read_subpage - [REPLACEABLE] ECC based sub-page read function * @mtd: mtd info structure * @chip: nand chip info structure * @data_offs: offset of requested data within the page @@ -1995,6 +1978,67 @@ static int nand_write_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip, return 0; } + +/** + * nand_write_subpage_hwecc - [REPLACABLE] hardware ECC based subpage write + * @mtd: mtd info structure + * @chip: nand chip info structure + * @column: column address of subpage within the page + * @data_len: data length + * @oob_required: must write chip->oob_poi to OOB + */ +static int nand_write_subpage_hwecc(struct mtd_info *mtd, + struct nand_chip *chip, uint32_t offset, + uint32_t data_len, const uint8_t *data_buf, + int oob_required) +{ + uint8_t *oob_buf = chip->oob_poi; + uint8_t *ecc_calc = chip->buffers->ecccalc; + int ecc_size = chip->ecc.size; + int ecc_bytes = chip->ecc.bytes; + int ecc_steps = chip->ecc.steps; + uint32_t *eccpos = chip->ecc.layout->eccpos; + uint32_t start_step = offset / ecc_size; + uint32_t end_step = (offset + data_len - 1) / ecc_size; + int oob_bytes = mtd->oobsize / ecc_steps; + int step, i; + + for (step = 0; step < ecc_steps; step++) { + /* configure controller for WRITE access */ + chip->ecc.hwctl(mtd, NAND_ECC_WRITE); + + /* write data (untouched subpages already masked by 0xFF) */ + chip->write_buf(mtd, data_buf, ecc_size); + + /* mask ECC of un-touched subpages by padding 0xFF */ + if ((step < start_step) || (step > end_step)) + memset(ecc_calc, 0xff, ecc_bytes); + else + chip->ecc.calculate(mtd, data_buf, ecc_calc); + + /* mask OOB of un-touched subpages by padding 0xFF */ + /* if oob_required, preserve OOB metadata of written subpage */ + if (!oob_required || (step < start_step) || (step > end_step)) + memset(oob_buf, 0xff, oob_bytes); + + data_buf += ecc_size; + ecc_calc += ecc_bytes; + oob_buf += oob_bytes; + } + + /* copy calculated ECC for whole page to chip->buffer->oob */ + /* this include masked-value(0xFF) for unwritten subpages */ + ecc_calc = chip->buffers->ecccalc; + for (i = 0; i < chip->ecc.total; i++) + chip->oob_poi[eccpos[i]] = ecc_calc[i]; + + /* write OOB buffer to NAND device */ + chip->write_buf(mtd, chip->oob_poi, mtd->oobsize); + + return 0; +} + + /** * nand_write_page_syndrome - [REPLACEABLE] hardware ECC syndrome based page write * @mtd: mtd info structure @@ -2047,6 +2091,8 @@ static int nand_write_page_syndrome(struct mtd_info *mtd, * nand_write_page - [REPLACEABLE] write one page * @mtd: MTD device structure * @chip: NAND chip descriptor + * @offset: address offset within the page + * @data_len: length of actual data to be written * @buf: the data to write * @oob_required: must write chip->oob_poi to OOB * @page: page number to write @@ -2054,15 +2100,25 @@ static int nand_write_page_syndrome(struct mtd_info *mtd, * @raw: use _raw version of write_page */ static int nand_write_page(struct mtd_info *mtd, struct nand_chip *chip, - const uint8_t *buf, int oob_required, int page, - int cached, int raw) + uint32_t offset, int data_len, const uint8_t *buf, + int oob_required, int page, int cached, int raw) { - int status; + int status, subpage; + + if (!(chip->options & NAND_NO_SUBPAGE_WRITE) && + chip->ecc.write_subpage) + subpage = offset || (data_len < mtd->writesize); + else + subpage = 0; chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page); if (unlikely(raw)) - status = chip->ecc.write_page_raw(mtd, chip, buf, oob_required); + status = chip->ecc.write_page_raw(mtd, chip, buf, + oob_required); + else if (subpage) + status = chip->ecc.write_subpage(mtd, chip, offset, data_len, + buf, oob_required); else status = chip->ecc.write_page(mtd, chip, buf, oob_required); @@ -2075,7 +2131,7 @@ static int nand_write_page(struct mtd_info *mtd, struct nand_chip *chip, */ cached = 0; - if (!cached || !(chip->options & NAND_CACHEPRG)) { + if (!cached || !NAND_HAS_CACHEPROG(chip)) { chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1); status = chip->waitfunc(mtd, chip); @@ -2176,7 +2232,7 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to, uint8_t *oob = ops->oobbuf; uint8_t *buf = ops->datbuf; - int ret, subpage; + int ret; int oob_required = oob ? 1 : 0; ops->retlen = 0; @@ -2191,10 +2247,6 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to, } column = to & (mtd->writesize - 1); - subpage = column || (writelen & (mtd->writesize - 1)); - - if (subpage && oob) - return -EINVAL; chipnr = (int)(to >> chip->chip_shift); chip->select_chip(mtd, chipnr); @@ -2243,9 +2295,9 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to, /* We still need to erase leftover OOB data */ memset(chip->oob_poi, 0xff, mtd->oobsize); } - - ret = chip->write_page(mtd, chip, wbuf, oob_required, page, - cached, (ops->mode == MTD_OPS_RAW)); + ret = chip->write_page(mtd, chip, column, bytes, wbuf, + oob_required, page, cached, + (ops->mode == MTD_OPS_RAW)); if (ret) break; @@ -2480,24 +2532,6 @@ static void single_erase_cmd(struct mtd_info *mtd, int page) chip->cmdfunc(mtd, NAND_CMD_ERASE2, -1, -1); } -/** - * multi_erase_cmd - [GENERIC] AND specific block erase command function - * @mtd: MTD device structure - * @page: the page address of the block which will be erased - * - * AND multi block erase command function. Erase 4 consecutive blocks. - */ -static void multi_erase_cmd(struct mtd_info *mtd, int page) -{ - struct nand_chip *chip = mtd->priv; - /* Send commands to erase a block */ - chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page++); - chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page++); - chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page++); - chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page); - chip->cmdfunc(mtd, NAND_CMD_ERASE2, -1, -1); -} - /** * nand_erase - [MTD Interface] erase block(s) * @mtd: MTD device structure @@ -2510,7 +2544,6 @@ static int nand_erase(struct mtd_info *mtd, struct erase_info *instr) return nand_erase_nand(mtd, instr, 0); } -#define BBT_PAGE_MASK 0xffffff3f /** * nand_erase_nand - [INTERN] erase block(s) * @mtd: MTD device structure @@ -2524,8 +2557,6 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr, { int page, status, pages_per_block, ret, chipnr; struct nand_chip *chip = mtd->priv; - loff_t rewrite_bbt[NAND_MAX_CHIPS] = {0}; - unsigned int bbt_masked_page = 0xffffffff; loff_t len; pr_debug("%s: start = 0x%012llx, len = %llu\n", @@ -2556,15 +2587,6 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr, goto erase_exit; } - /* - * If BBT requires refresh, set the BBT page mask to see if the BBT - * should be rewritten. Otherwise the mask is set to 0xffffffff which - * can not be matched. This is also done when the bbt is actually - * erased to avoid recursive updates. - */ - if (chip->options & BBT_AUTO_REFRESH && !allowbbt) - bbt_masked_page = chip->bbt_td->pages[chipnr] & BBT_PAGE_MASK; - /* Loop through the pages */ len = instr->len; @@ -2610,15 +2632,6 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr, goto erase_exit; } - /* - * If BBT requires refresh, set the BBT rewrite flag to the - * page being erased. - */ - if (bbt_masked_page != 0xffffffff && - (page & BBT_PAGE_MASK) == bbt_masked_page) - rewrite_bbt[chipnr] = - ((loff_t)page << chip->page_shift); - /* Increment page address and decrement length */ len -= (1 << chip->phys_erase_shift); page += pages_per_block; @@ -2628,15 +2641,6 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr, chipnr++; chip->select_chip(mtd, -1); chip->select_chip(mtd, chipnr); - - /* - * If BBT requires refresh and BBT-PERCHIP, set the BBT - * page mask to see if this BBT should be rewritten. - */ - if (bbt_masked_page != 0xffffffff && - (chip->bbt_td->options & NAND_BBT_PERCHIP)) - bbt_masked_page = chip->bbt_td->pages[chipnr] & - BBT_PAGE_MASK; } } instr->state = MTD_ERASE_DONE; @@ -2653,23 +2657,6 @@ erase_exit: if (!ret) mtd_erase_callback(instr); - /* - * If BBT requires refresh and erase was successful, rewrite any - * selected bad block tables. - */ - if (bbt_masked_page == 0xffffffff || ret) - return ret; - - for (chipnr = 0; chipnr < chip->numchips; chipnr++) { - if (!rewrite_bbt[chipnr]) - continue; - /* Update the BBT for chip */ - pr_debug("%s: nand_update_bbt (%d:0x%0llx 0x%0x)\n", - __func__, chipnr, rewrite_bbt[chipnr], - chip->bbt_td->pages[chipnr]); - nand_update_bbt(mtd, rewrite_bbt[chipnr]); - } - /* Return more or less happy */ return ret; } @@ -2905,8 +2892,6 @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip, chip->onfi_version = 20; else if (val & (1 << 1)) chip->onfi_version = 10; - else - chip->onfi_version = 0; if (!chip->onfi_version) { pr_info("%s: unsupported ONFI version: %d\n", __func__, val); @@ -3171,6 +3156,30 @@ static void nand_decode_bbm_options(struct mtd_info *mtd, chip->bbt_options |= NAND_BBT_SCAN2NDPAGE; } +static inline bool is_full_id_nand(struct nand_flash_dev *type) +{ + return type->id_len; +} + +static bool find_full_id_nand(struct mtd_info *mtd, struct nand_chip *chip, + struct nand_flash_dev *type, u8 *id_data, int *busw) +{ + if (!strncmp(type->id, id_data, type->id_len)) { + mtd->writesize = type->pagesize; + mtd->erasesize = type->erasesize; + mtd->oobsize = type->oobsize; + + chip->cellinfo = id_data[2]; + chip->chipsize = (uint64_t)type->chipsize << 20; + chip->options |= type->options; + + *busw = type->options & NAND_BUSWIDTH_16; + + return true; + } + return false; +} + /* * Get the flash and manufacturer id and lookup if the type is supported. */ @@ -3222,9 +3231,14 @@ static struct nand_flash_dev *nand_get_flash_type(struct mtd_info *mtd, if (!type) type = nand_flash_ids; - for (; type->name != NULL; type++) - if (*dev_id == type->id) - break; + for (; type->name != NULL; type++) { + if (is_full_id_nand(type)) { + if (find_full_id_nand(mtd, chip, type, id_data, &busw)) + goto ident_done; + } else if (*dev_id == type->dev_id) { + break; + } + } chip->onfi_version = 0; if (!type->name || !type->pagesize) { @@ -3302,12 +3316,7 @@ ident_done: } chip->badblockbits = 8; - - /* Check for AND chips with 4 page planes */ - if (chip->options & NAND_4PAGE_ARRAY) - chip->erase_cmd = multi_erase_cmd; - else - chip->erase_cmd = single_erase_cmd; + chip->erase_cmd = single_erase_cmd; /* Do not replace user supplied command function! */ if (mtd->writesize > 512 && chip->cmdfunc == nand_command) @@ -3474,6 +3483,10 @@ int nand_scan_tail(struct mtd_info *mtd) chip->ecc.read_oob = nand_read_oob_std; if (!chip->ecc.write_oob) chip->ecc.write_oob = nand_write_oob_std; + if (!chip->ecc.read_subpage) + chip->ecc.read_subpage = nand_read_subpage; + if (!chip->ecc.write_subpage) + chip->ecc.write_subpage = nand_write_subpage_hwecc; case NAND_ECC_HW_SYNDROME: if ((!chip->ecc.calculate || !chip->ecc.correct || diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c index 916d6e9c0ab1..267264320e06 100644 --- a/drivers/mtd/nand/nand_bbt.c +++ b/drivers/mtd/nand/nand_bbt.c @@ -1240,15 +1240,6 @@ int nand_update_bbt(struct mtd_info *mtd, loff_t offs) */ static uint8_t scan_ff_pattern[] = { 0xff, 0xff }; -static uint8_t scan_agand_pattern[] = { 0x1C, 0x71, 0xC7, 0x1C, 0x71, 0xC7 }; - -static struct nand_bbt_descr agand_flashbased = { - .options = NAND_BBT_SCANEMPTY | NAND_BBT_SCANALLPAGES, - .offs = 0x20, - .len = 6, - .pattern = scan_agand_pattern -}; - /* Generic flash bbt descriptors */ static uint8_t bbt_pattern[] = {'B', 'b', 't', '0' }; static uint8_t mirror_pattern[] = {'1', 't', 'b', 'B' }; @@ -1333,22 +1324,6 @@ int nand_default_bbt(struct mtd_info *mtd) { struct nand_chip *this = mtd->priv; - /* - * Default for AG-AND. We must use a flash based bad block table as the - * devices have factory marked _good_ blocks. Erasing those blocks - * leads to loss of the good / bad information, so we _must_ store this - * information in a good / bad table during startup. - */ - if (this->options & NAND_IS_AND) { - /* Use the default pattern descriptors */ - if (!this->bbt_td) { - this->bbt_td = &bbt_main_descr; - this->bbt_md = &bbt_mirror_descr; - } - this->bbt_options |= NAND_BBT_USE_FLASH; - return nand_scan_bbt(mtd, &agand_flashbased); - } - /* Is a flash based bad block table requested? */ if (this->bbt_options & NAND_BBT_USE_FLASH) { /* Use the default pattern descriptors */ diff --git a/drivers/mtd/nand/nand_ids.c b/drivers/mtd/nand/nand_ids.c index 9c612388e5de..683813a46a90 100644 --- a/drivers/mtd/nand/nand_ids.c +++ b/drivers/mtd/nand/nand_ids.c @@ -10,163 +10,153 @@ */ #include #include -/* -* Chip ID list -* -* Name. ID code, pagesize, chipsize in MegaByte, eraseblock size, -* options -* -* Pagesize; 0, 256, 512 -* 0 get this information from the extended chip ID -+ 256 256 Byte page size -* 512 512 Byte page size -*/ -struct nand_flash_dev nand_flash_ids[] = { +#include + +#define LP_OPTIONS NAND_SAMSUNG_LP_OPTIONS +#define LP_OPTIONS16 (LP_OPTIONS | NAND_BUSWIDTH_16) + #define SP_OPTIONS NAND_NEED_READRDY #define SP_OPTIONS16 (SP_OPTIONS | NAND_BUSWIDTH_16) -#ifdef CONFIG_MTD_NAND_MUSEUM_IDS - {"NAND 1MiB 5V 8-bit", 0x6e, 256, 1, 0x1000, SP_OPTIONS}, - {"NAND 2MiB 5V 8-bit", 0x64, 256, 2, 0x1000, SP_OPTIONS}, - {"NAND 4MiB 5V 8-bit", 0x6b, 512, 4, 0x2000, SP_OPTIONS}, - {"NAND 1MiB 3,3V 8-bit", 0xe8, 256, 1, 0x1000, SP_OPTIONS}, - {"NAND 1MiB 3,3V 8-bit", 0xec, 256, 1, 0x1000, SP_OPTIONS}, - {"NAND 2MiB 3,3V 8-bit", 0xea, 256, 2, 0x1000, SP_OPTIONS}, - {"NAND 4MiB 3,3V 8-bit", 0xd5, 512, 4, 0x2000, SP_OPTIONS}, - {"NAND 4MiB 3,3V 8-bit", 0xe3, 512, 4, 0x2000, SP_OPTIONS}, - {"NAND 4MiB 3,3V 8-bit", 0xe5, 512, 4, 0x2000, SP_OPTIONS}, - {"NAND 8MiB 3,3V 8-bit", 0xd6, 512, 8, 0x2000, SP_OPTIONS}, - - {"NAND 8MiB 1,8V 8-bit", 0x39, 512, 8, 0x2000, SP_OPTIONS}, - {"NAND 8MiB 3,3V 8-bit", 0xe6, 512, 8, 0x2000, SP_OPTIONS}, - {"NAND 8MiB 1,8V 16-bit", 0x49, 512, 8, 0x2000, SP_OPTIONS16}, - {"NAND 8MiB 3,3V 16-bit", 0x59, 512, 8, 0x2000, SP_OPTIONS16}, -#endif - - {"NAND 16MiB 1,8V 8-bit", 0x33, 512, 16, 0x4000, SP_OPTIONS}, - {"NAND 16MiB 3,3V 8-bit", 0x73, 512, 16, 0x4000, SP_OPTIONS}, - {"NAND 16MiB 1,8V 16-bit", 0x43, 512, 16, 0x4000, SP_OPTIONS16}, - {"NAND 16MiB 3,3V 16-bit", 0x53, 512, 16, 0x4000, SP_OPTIONS16}, - - {"NAND 32MiB 1,8V 8-bit", 0x35, 512, 32, 0x4000, SP_OPTIONS}, - {"NAND 32MiB 3,3V 8-bit", 0x75, 512, 32, 0x4000, SP_OPTIONS}, - {"NAND 32MiB 1,8V 16-bit", 0x45, 512, 32, 0x4000, SP_OPTIONS16}, - {"NAND 32MiB 3,3V 16-bit", 0x55, 512, 32, 0x4000, SP_OPTIONS16}, - - {"NAND 64MiB 1,8V 8-bit", 0x36, 512, 64, 0x4000, SP_OPTIONS}, - {"NAND 64MiB 3,3V 8-bit", 0x76, 512, 64, 0x4000, SP_OPTIONS}, - {"NAND 64MiB 1,8V 16-bit", 0x46, 512, 64, 0x4000, SP_OPTIONS16}, - {"NAND 64MiB 3,3V 16-bit", 0x56, 512, 64, 0x4000, SP_OPTIONS16}, - - {"NAND 128MiB 1,8V 8-bit", 0x78, 512, 128, 0x4000, SP_OPTIONS}, - {"NAND 128MiB 1,8V 8-bit", 0x39, 512, 128, 0x4000, SP_OPTIONS}, - {"NAND 128MiB 3,3V 8-bit", 0x79, 512, 128, 0x4000, SP_OPTIONS}, - {"NAND 128MiB 1,8V 16-bit", 0x72, 512, 128, 0x4000, SP_OPTIONS16}, - {"NAND 128MiB 1,8V 16-bit", 0x49, 512, 128, 0x4000, SP_OPTIONS16}, - {"NAND 128MiB 3,3V 16-bit", 0x74, 512, 128, 0x4000, SP_OPTIONS16}, - {"NAND 128MiB 3,3V 16-bit", 0x59, 512, 128, 0x4000, SP_OPTIONS16}, - - {"NAND 256MiB 3,3V 8-bit", 0x71, 512, 256, 0x4000, SP_OPTIONS}, +/* + * The chip ID list: + * name, device ID, page size, chip size in MiB, eraseblock size, options + * + * If page size and eraseblock size are 0, the sizes are taken from the + * extended chip ID. + */ +struct nand_flash_dev nand_flash_ids[] = { + /* + * Some incompatible NAND chips share device ID's and so must be + * listed by full ID. We list them first so that we can easily identify + * the most specific match. + */ + {"TC58NVG2S0F 4G 3.3V 8-bit", + { .id = {0x98, 0xdc, 0x90, 0x26, 0x76, 0x15, 0x01, 0x08} }, + SZ_4K, SZ_512, SZ_256K, 0, 8, 224}, + {"TC58NVG3S0F 8G 3.3V 8-bit", + { .id = {0x98, 0xd3, 0x90, 0x26, 0x76, 0x15, 0x02, 0x08} }, + SZ_4K, SZ_1K, SZ_256K, 0, 8, 232}, + {"TC58NVG5D2 32G 3.3V 8-bit", + { .id = {0x98, 0xd7, 0x94, 0x32, 0x76, 0x56, 0x09, 0x00} }, + SZ_8K, SZ_4K, SZ_1M, 0, 8, 640}, + {"TC58NVG6D2 64G 3.3V 8-bit", + { .id = {0x98, 0xde, 0x94, 0x82, 0x76, 0x56, 0x04, 0x20} }, + SZ_8K, SZ_8K, SZ_2M, 0, 8, 640}, + + LEGACY_ID_NAND("NAND 4MiB 5V 8-bit", 0x6B, 4, SZ_8K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 4MiB 3,3V 8-bit", 0xE3, 4, SZ_8K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 4MiB 3,3V 8-bit", 0xE5, 4, SZ_8K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 8MiB 3,3V 8-bit", 0xD6, 8, SZ_8K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 8MiB 3,3V 8-bit", 0xE6, 8, SZ_8K, SP_OPTIONS), + + LEGACY_ID_NAND("NAND 16MiB 1,8V 8-bit", 0x33, 16, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 16MiB 3,3V 8-bit", 0x73, 16, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 16MiB 1,8V 16-bit", 0x43, 16, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 16MiB 3,3V 16-bit", 0x53, 16, SZ_16K, SP_OPTIONS16), + + LEGACY_ID_NAND("NAND 32MiB 1,8V 8-bit", 0x35, 32, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 32MiB 3,3V 8-bit", 0x75, 32, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 32MiB 1,8V 16-bit", 0x45, 32, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 32MiB 3,3V 16-bit", 0x55, 32, SZ_16K, SP_OPTIONS16), + + LEGACY_ID_NAND("NAND 64MiB 1,8V 8-bit", 0x36, 64, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 64MiB 3,3V 8-bit", 0x76, 64, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 64MiB 1,8V 16-bit", 0x46, 64, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 64MiB 3,3V 16-bit", 0x56, 64, SZ_16K, SP_OPTIONS16), + + LEGACY_ID_NAND("NAND 128MiB 1,8V 8-bit", 0x78, 128, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 128MiB 1,8V 8-bit", 0x39, 128, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 128MiB 3,3V 8-bit", 0x79, 128, SZ_16K, SP_OPTIONS), + LEGACY_ID_NAND("NAND 128MiB 1,8V 16-bit", 0x72, 128, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 128MiB 1,8V 16-bit", 0x49, 128, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 128MiB 3,3V 16-bit", 0x74, 128, SZ_16K, SP_OPTIONS16), + LEGACY_ID_NAND("NAND 128MiB 3,3V 16-bit", 0x59, 128, SZ_16K, SP_OPTIONS16), + + LEGACY_ID_NAND("NAND 256MiB 3,3V 8-bit", 0x71, 256, SZ_16K, SP_OPTIONS), /* - * These are the new chips with large page size. The pagesize and the - * erasesize is determined from the extended id bytes + * These are the new chips with large page size. Their page size and + * eraseblock size are determined from the extended ID bytes. */ -#define LP_OPTIONS NAND_SAMSUNG_LP_OPTIONS -#define LP_OPTIONS16 (LP_OPTIONS | NAND_BUSWIDTH_16) /* 512 Megabit */ - {"NAND 64MiB 1,8V 8-bit", 0xA2, 0, 64, 0, LP_OPTIONS}, - {"NAND 64MiB 1,8V 8-bit", 0xA0, 0, 64, 0, LP_OPTIONS}, - {"NAND 64MiB 3,3V 8-bit", 0xF2, 0, 64, 0, LP_OPTIONS}, - {"NAND 64MiB 3,3V 8-bit", 0xD0, 0, 64, 0, LP_OPTIONS}, - {"NAND 64MiB 3,3V 8-bit", 0xF0, 0, 64, 0, LP_OPTIONS}, - {"NAND 64MiB 1,8V 16-bit", 0xB2, 0, 64, 0, LP_OPTIONS16}, - {"NAND 64MiB 1,8V 16-bit", 0xB0, 0, 64, 0, LP_OPTIONS16}, - {"NAND 64MiB 3,3V 16-bit", 0xC2, 0, 64, 0, LP_OPTIONS16}, - {"NAND 64MiB 3,3V 16-bit", 0xC0, 0, 64, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 64MiB 1,8V 8-bit", 0xA2, 64, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64MiB 1,8V 8-bit", 0xA0, 64, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64MiB 3,3V 8-bit", 0xF2, 64, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64MiB 3,3V 8-bit", 0xD0, 64, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64MiB 3,3V 8-bit", 0xF0, 64, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64MiB 1,8V 16-bit", 0xB2, 64, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 64MiB 1,8V 16-bit", 0xB0, 64, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 64MiB 3,3V 16-bit", 0xC2, 64, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 64MiB 3,3V 16-bit", 0xC0, 64, LP_OPTIONS16), /* 1 Gigabit */ - {"NAND 128MiB 1,8V 8-bit", 0xA1, 0, 128, 0, LP_OPTIONS}, - {"NAND 128MiB 3,3V 8-bit", 0xF1, 0, 128, 0, LP_OPTIONS}, - {"NAND 128MiB 3,3V 8-bit", 0xD1, 0, 128, 0, LP_OPTIONS}, - {"NAND 128MiB 1,8V 16-bit", 0xB1, 0, 128, 0, LP_OPTIONS16}, - {"NAND 128MiB 3,3V 16-bit", 0xC1, 0, 128, 0, LP_OPTIONS16}, - {"NAND 128MiB 1,8V 16-bit", 0xAD, 0, 128, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 128MiB 1,8V 8-bit", 0xA1, 128, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 128MiB 3,3V 8-bit", 0xF1, 128, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 128MiB 3,3V 8-bit", 0xD1, 128, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 128MiB 1,8V 16-bit", 0xB1, 128, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 128MiB 3,3V 16-bit", 0xC1, 128, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 128MiB 1,8V 16-bit", 0xAD, 128, LP_OPTIONS16), /* 2 Gigabit */ - {"NAND 256MiB 1,8V 8-bit", 0xAA, 0, 256, 0, LP_OPTIONS}, - {"NAND 256MiB 3,3V 8-bit", 0xDA, 0, 256, 0, LP_OPTIONS}, - {"NAND 256MiB 1,8V 16-bit", 0xBA, 0, 256, 0, LP_OPTIONS16}, - {"NAND 256MiB 3,3V 16-bit", 0xCA, 0, 256, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 256MiB 1,8V 8-bit", 0xAA, 256, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 256MiB 3,3V 8-bit", 0xDA, 256, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 256MiB 1,8V 16-bit", 0xBA, 256, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 256MiB 3,3V 16-bit", 0xCA, 256, LP_OPTIONS16), /* 4 Gigabit */ - {"NAND 512MiB 1,8V 8-bit", 0xAC, 0, 512, 0, LP_OPTIONS}, - {"NAND 512MiB 3,3V 8-bit", 0xDC, 0, 512, 0, LP_OPTIONS}, - {"NAND 512MiB 1,8V 16-bit", 0xBC, 0, 512, 0, LP_OPTIONS16}, - {"NAND 512MiB 3,3V 16-bit", 0xCC, 0, 512, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 512MiB 1,8V 8-bit", 0xAC, 512, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 512MiB 3,3V 8-bit", 0xDC, 512, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 512MiB 1,8V 16-bit", 0xBC, 512, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 512MiB 3,3V 16-bit", 0xCC, 512, LP_OPTIONS16), /* 8 Gigabit */ - {"NAND 1GiB 1,8V 8-bit", 0xA3, 0, 1024, 0, LP_OPTIONS}, - {"NAND 1GiB 3,3V 8-bit", 0xD3, 0, 1024, 0, LP_OPTIONS}, - {"NAND 1GiB 1,8V 16-bit", 0xB3, 0, 1024, 0, LP_OPTIONS16}, - {"NAND 1GiB 3,3V 16-bit", 0xC3, 0, 1024, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 1GiB 1,8V 8-bit", 0xA3, 1024, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 1GiB 3,3V 8-bit", 0xD3, 1024, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 1GiB 1,8V 16-bit", 0xB3, 1024, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 1GiB 3,3V 16-bit", 0xC3, 1024, LP_OPTIONS16), /* 16 Gigabit */ - {"NAND 2GiB 1,8V 8-bit", 0xA5, 0, 2048, 0, LP_OPTIONS}, - {"NAND 2GiB 3,3V 8-bit", 0xD5, 0, 2048, 0, LP_OPTIONS}, - {"NAND 2GiB 1,8V 16-bit", 0xB5, 0, 2048, 0, LP_OPTIONS16}, - {"NAND 2GiB 3,3V 16-bit", 0xC5, 0, 2048, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 2GiB 1,8V 8-bit", 0xA5, 2048, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 2GiB 3,3V 8-bit", 0xD5, 2048, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 2GiB 1,8V 16-bit", 0xB5, 2048, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 2GiB 3,3V 16-bit", 0xC5, 2048, LP_OPTIONS16), /* 32 Gigabit */ - {"NAND 4GiB 1,8V 8-bit", 0xA7, 0, 4096, 0, LP_OPTIONS}, - {"NAND 4GiB 3,3V 8-bit", 0xD7, 0, 4096, 0, LP_OPTIONS}, - {"NAND 4GiB 1,8V 16-bit", 0xB7, 0, 4096, 0, LP_OPTIONS16}, - {"NAND 4GiB 3,3V 16-bit", 0xC7, 0, 4096, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 4GiB 1,8V 8-bit", 0xA7, 4096, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 4GiB 3,3V 8-bit", 0xD7, 4096, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 4GiB 1,8V 16-bit", 0xB7, 4096, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 4GiB 3,3V 16-bit", 0xC7, 4096, LP_OPTIONS16), /* 64 Gigabit */ - {"NAND 8GiB 1,8V 8-bit", 0xAE, 0, 8192, 0, LP_OPTIONS}, - {"NAND 8GiB 3,3V 8-bit", 0xDE, 0, 8192, 0, LP_OPTIONS}, - {"NAND 8GiB 1,8V 16-bit", 0xBE, 0, 8192, 0, LP_OPTIONS16}, - {"NAND 8GiB 3,3V 16-bit", 0xCE, 0, 8192, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 8GiB 1,8V 8-bit", 0xAE, 8192, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 8GiB 3,3V 8-bit", 0xDE, 8192, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 8GiB 1,8V 16-bit", 0xBE, 8192, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 8GiB 3,3V 16-bit", 0xCE, 8192, LP_OPTIONS16), /* 128 Gigabit */ - {"NAND 16GiB 1,8V 8-bit", 0x1A, 0, 16384, 0, LP_OPTIONS}, - {"NAND 16GiB 3,3V 8-bit", 0x3A, 0, 16384, 0, LP_OPTIONS}, - {"NAND 16GiB 1,8V 16-bit", 0x2A, 0, 16384, 0, LP_OPTIONS16}, - {"NAND 16GiB 3,3V 16-bit", 0x4A, 0, 16384, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 16GiB 1,8V 8-bit", 0x1A, 16384, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 16GiB 3,3V 8-bit", 0x3A, 16384, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 16GiB 1,8V 16-bit", 0x2A, 16384, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 16GiB 3,3V 16-bit", 0x4A, 16384, LP_OPTIONS16), /* 256 Gigabit */ - {"NAND 32GiB 1,8V 8-bit", 0x1C, 0, 32768, 0, LP_OPTIONS}, - {"NAND 32GiB 3,3V 8-bit", 0x3C, 0, 32768, 0, LP_OPTIONS}, - {"NAND 32GiB 1,8V 16-bit", 0x2C, 0, 32768, 0, LP_OPTIONS16}, - {"NAND 32GiB 3,3V 16-bit", 0x4C, 0, 32768, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 32GiB 1,8V 8-bit", 0x1C, 32768, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 32GiB 3,3V 8-bit", 0x3C, 32768, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 32GiB 1,8V 16-bit", 0x2C, 32768, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 32GiB 3,3V 16-bit", 0x4C, 32768, LP_OPTIONS16), /* 512 Gigabit */ - {"NAND 64GiB 1,8V 8-bit", 0x1E, 0, 65536, 0, LP_OPTIONS}, - {"NAND 64GiB 3,3V 8-bit", 0x3E, 0, 65536, 0, LP_OPTIONS}, - {"NAND 64GiB 1,8V 16-bit", 0x2E, 0, 65536, 0, LP_OPTIONS16}, - {"NAND 64GiB 3,3V 16-bit", 0x4E, 0, 65536, 0, LP_OPTIONS16}, + EXTENDED_ID_NAND("NAND 64GiB 1,8V 8-bit", 0x1E, 65536, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64GiB 3,3V 8-bit", 0x3E, 65536, LP_OPTIONS), + EXTENDED_ID_NAND("NAND 64GiB 1,8V 16-bit", 0x2E, 65536, LP_OPTIONS16), + EXTENDED_ID_NAND("NAND 64GiB 3,3V 16-bit", 0x4E, 65536, LP_OPTIONS16), - /* - * Renesas AND 1 Gigabit. Those chips do not support extended id and - * have a strange page/block layout ! The chosen minimum erasesize is - * 4 * 2 * 2048 = 16384 Byte, as those chips have an array of 4 page - * planes 1 block = 2 pages, but due to plane arrangement the blocks - * 0-3 consists of page 0 + 4,1 + 5, 2 + 6, 3 + 7 Anyway JFFS2 would - * increase the eraseblock size so we chose a combined one which can be - * erased in one go There are more speed improvements for reads and - * writes possible, but not implemented now - */ - {"AND 128MiB 3,3V 8-bit", 0x01, 2048, 128, 0x4000, - NAND_IS_AND | NAND_4PAGE_ARRAY | BBT_AUTO_REFRESH}, - - {NULL,} + {NULL} }; -/* -* Manufacturer ID list -*/ +/* Manufacturer IDs */ struct nand_manufacturers nand_manuf_ids[] = { {NAND_MFR_TOSHIBA, "Toshiba"}, {NAND_MFR_SAMSUNG, "Samsung"}, diff --git a/drivers/mtd/nand/nandsim.c b/drivers/mtd/nand/nandsim.c index 891c52a30e6a..cb38f3d94218 100644 --- a/drivers/mtd/nand/nandsim.c +++ b/drivers/mtd/nand/nandsim.c @@ -218,7 +218,6 @@ MODULE_PARM_DESC(bch, "Enable BCH ecc and set how many bits should " #define STATE_CMD_READOOB 0x00000005 /* read OOB area */ #define STATE_CMD_ERASE1 0x00000006 /* sector erase first command */ #define STATE_CMD_STATUS 0x00000007 /* read status */ -#define STATE_CMD_STATUS_M 0x00000008 /* read multi-plane status (isn't implemented) */ #define STATE_CMD_SEQIN 0x00000009 /* sequential data input */ #define STATE_CMD_READID 0x0000000A /* read ID */ #define STATE_CMD_ERASE2 0x0000000B /* sector erase second command */ @@ -263,14 +262,13 @@ MODULE_PARM_DESC(bch, "Enable BCH ecc and set how many bits should " #define NS_OPER_STATES 6 /* Maximum number of states in operation */ #define OPT_ANY 0xFFFFFFFF /* any chip supports this operation */ -#define OPT_PAGE256 0x00000001 /* 256-byte page chips */ #define OPT_PAGE512 0x00000002 /* 512-byte page chips */ #define OPT_PAGE2048 0x00000008 /* 2048-byte page chips */ #define OPT_SMARTMEDIA 0x00000010 /* SmartMedia technology chips */ #define OPT_PAGE512_8BIT 0x00000040 /* 512-byte page chips with 8-bit bus width */ #define OPT_PAGE4096 0x00000080 /* 4096-byte page chips */ #define OPT_LARGEPAGE (OPT_PAGE2048 | OPT_PAGE4096) /* 2048 & 4096-byte page chips */ -#define OPT_SMALLPAGE (OPT_PAGE256 | OPT_PAGE512) /* 256 and 512-byte page chips */ +#define OPT_SMALLPAGE (OPT_PAGE512) /* 512-byte page chips */ /* Remove action bits from state */ #define NS_STATE(x) ((x) & ~ACTION_MASK) @@ -406,8 +404,6 @@ static struct nandsim_operations { {OPT_ANY, {STATE_CMD_ERASE1, STATE_ADDR_SEC, STATE_CMD_ERASE2 | ACTION_SECERASE, STATE_READY}}, /* Read status */ {OPT_ANY, {STATE_CMD_STATUS, STATE_DATAOUT_STATUS, STATE_READY}}, - /* Read multi-plane status */ - {OPT_SMARTMEDIA, {STATE_CMD_STATUS_M, STATE_DATAOUT_STATUS_M, STATE_READY}}, /* Read ID */ {OPT_ANY, {STATE_CMD_READID, STATE_ADDR_ZERO, STATE_DATAOUT_ID, STATE_READY}}, /* Large page devices read page */ @@ -699,10 +695,7 @@ static int init_nandsim(struct mtd_info *mtd) ns->geom.secszoob = ns->geom.secsz + ns->geom.oobsz * ns->geom.pgsec; ns->options = 0; - if (ns->geom.pgsz == 256) { - ns->options |= OPT_PAGE256; - } - else if (ns->geom.pgsz == 512) { + if (ns->geom.pgsz == 512) { ns->options |= OPT_PAGE512; if (ns->busw == 8) ns->options |= OPT_PAGE512_8BIT; @@ -769,9 +762,9 @@ static int init_nandsim(struct mtd_info *mtd) } /* Detect how many ID bytes the NAND chip outputs */ - for (i = 0; nand_flash_ids[i].name != NULL; i++) { - if (second_id_byte != nand_flash_ids[i].id) - continue; + for (i = 0; nand_flash_ids[i].name != NULL; i++) { + if (second_id_byte != nand_flash_ids[i].dev_id) + continue; } if (ns->busw == 16) @@ -1079,8 +1072,6 @@ static char *get_state_name(uint32_t state) return "STATE_CMD_ERASE1"; case STATE_CMD_STATUS: return "STATE_CMD_STATUS"; - case STATE_CMD_STATUS_M: - return "STATE_CMD_STATUS_M"; case STATE_CMD_SEQIN: return "STATE_CMD_SEQIN"; case STATE_CMD_READID: @@ -1145,7 +1136,6 @@ static int check_command(int cmd) case NAND_CMD_RNDOUTSTART: return 0; - case NAND_CMD_STATUS_MULTI: default: return 1; } @@ -1171,8 +1161,6 @@ static uint32_t get_state_by_command(unsigned command) return STATE_CMD_ERASE1; case NAND_CMD_STATUS: return STATE_CMD_STATUS; - case NAND_CMD_STATUS_MULTI: - return STATE_CMD_STATUS_M; case NAND_CMD_SEQIN: return STATE_CMD_SEQIN; case NAND_CMD_READID: @@ -2306,7 +2294,7 @@ static int __init ns_init_module(void) nand->geom.idbytes = 2; nand->regs.status = NS_STATUS_OK(nand); nand->nxstate = STATE_UNKNOWN; - nand->options |= OPT_PAGE256; /* temporary value */ + nand->options |= OPT_PAGE512; /* temporary value */ nand->ids[0] = first_id_byte; nand->ids[1] = second_id_byte; nand->ids[2] = third_id_byte; diff --git a/drivers/mtd/nand/nuc900_nand.c b/drivers/mtd/nand/nuc900_nand.c index a6191198d259..cd6be2ed53a8 100644 --- a/drivers/mtd/nand/nuc900_nand.c +++ b/drivers/mtd/nand/nuc900_nand.c @@ -177,15 +177,6 @@ static void nuc900_nand_command_lp(struct mtd_info *mtd, unsigned int command, case NAND_CMD_SEQIN: case NAND_CMD_RNDIN: case NAND_CMD_STATUS: - case NAND_CMD_DEPLETE1: - return; - - case NAND_CMD_STATUS_ERROR: - case NAND_CMD_STATUS_ERROR0: - case NAND_CMD_STATUS_ERROR1: - case NAND_CMD_STATUS_ERROR2: - case NAND_CMD_STATUS_ERROR3: - udelay(chip->chip_delay); return; case NAND_CMD_RESET: diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c index 8e820ddf4e08..81b80af55872 100644 --- a/drivers/mtd/nand/omap2.c +++ b/drivers/mtd/nand/omap2.c @@ -1023,9 +1023,9 @@ static int omap_wait(struct mtd_info *mtd, struct nand_chip *chip) int status, state = this->state; if (state == FL_ERASING) - timeo += (HZ * 400) / 1000; + timeo += msecs_to_jiffies(400); else - timeo += (HZ * 20) / 1000; + timeo += msecs_to_jiffies(20); writeb(NAND_CMD_STATUS & 0xFF, info->reg.gpmc_nand_command); while (time_before(jiffies, timeo)) { @@ -1701,8 +1701,9 @@ static int omap3_init_bch(struct mtd_info *mtd, int ecc_opt) elm_node = of_find_node_by_phandle(be32_to_cpup(parp)); pdev = of_find_device_by_node(elm_node); info->elm_dev = &pdev->dev; - elm_config(info->elm_dev, bch_type); - info->is_elm_used = true; + + if (elm_config(info->elm_dev, bch_type) == 0) + info->is_elm_used = true; } if (info->is_elm_used && (mtd->writesize <= 4096)) { diff --git a/drivers/mtd/nand/orion_nand.c b/drivers/mtd/nand/orion_nand.c index cd72b9299f6b..8fbd00208610 100644 --- a/drivers/mtd/nand/orion_nand.c +++ b/drivers/mtd/nand/orion_nand.c @@ -231,18 +231,7 @@ static struct platform_driver orion_nand_driver = { }, }; -static int __init orion_nand_init(void) -{ - return platform_driver_probe(&orion_nand_driver, orion_nand_probe); -} - -static void __exit orion_nand_exit(void) -{ - platform_driver_unregister(&orion_nand_driver); -} - -module_init(orion_nand_init); -module_exit(orion_nand_exit); +module_platform_driver_probe(orion_nand_driver, orion_nand_probe); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Tzachi Perelstein"); diff --git a/drivers/mtd/nand/ppchameleonevb.c b/drivers/mtd/nand/ppchameleonevb.c deleted file mode 100644 index 0ddd90e5788f..000000000000 --- a/drivers/mtd/nand/ppchameleonevb.c +++ /dev/null @@ -1,403 +0,0 @@ -/* - * drivers/mtd/nand/ppchameleonevb.c - * - * Copyright (C) 2003 DAVE Srl (info@wawnet.biz) - * - * Derived from drivers/mtd/nand/edb7312.c - * - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * Overview: - * This is a device driver for the NAND flash devices found on the - * PPChameleon/PPChameleonEVB system. - * PPChameleon options (autodetected): - * - BA model: no NAND - * - ME model: 32MB (Samsung K9F5608U0B) - * - HI model: 128MB (Samsung K9F1G08UOM) - * PPChameleonEVB options: - * - 32MB (Samsung K9F5608U0B) - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -#undef USE_READY_BUSY_PIN -#define USE_READY_BUSY_PIN -/* see datasheets (tR) */ -#define NAND_BIG_DELAY_US 25 -#define NAND_SMALL_DELAY_US 10 - -/* handy sizes */ -#define SZ_4M 0x00400000 -#define NAND_SMALL_SIZE 0x02000000 -#define NAND_MTD_NAME "ppchameleon-nand" -#define NAND_EVB_MTD_NAME "ppchameleonevb-nand" - -/* GPIO pins used to drive NAND chip mounted on processor module */ -#define NAND_nCE_GPIO_PIN (0x80000000 >> 1) -#define NAND_CLE_GPIO_PIN (0x80000000 >> 2) -#define NAND_ALE_GPIO_PIN (0x80000000 >> 3) -#define NAND_RB_GPIO_PIN (0x80000000 >> 4) -/* GPIO pins used to drive NAND chip mounted on EVB */ -#define NAND_EVB_nCE_GPIO_PIN (0x80000000 >> 14) -#define NAND_EVB_CLE_GPIO_PIN (0x80000000 >> 15) -#define NAND_EVB_ALE_GPIO_PIN (0x80000000 >> 16) -#define NAND_EVB_RB_GPIO_PIN (0x80000000 >> 31) - -/* - * MTD structure for PPChameleonEVB board - */ -static struct mtd_info *ppchameleon_mtd = NULL; -static struct mtd_info *ppchameleonevb_mtd = NULL; - -/* - * Module stuff - */ -static unsigned long ppchameleon_fio_pbase = CFG_NAND0_PADDR; -static unsigned long ppchameleonevb_fio_pbase = CFG_NAND1_PADDR; - -#ifdef MODULE -module_param(ppchameleon_fio_pbase, ulong, 0); -module_param(ppchameleonevb_fio_pbase, ulong, 0); -#else -__setup("ppchameleon_fio_pbase=", ppchameleon_fio_pbase); -__setup("ppchameleonevb_fio_pbase=", ppchameleonevb_fio_pbase); -#endif - -/* - * Define static partitions for flash devices - */ -static struct mtd_partition partition_info_hi[] = { - { .name = "PPChameleon HI Nand Flash", - .offset = 0, - .size = 128 * 1024 * 1024 - } -}; - -static struct mtd_partition partition_info_me[] = { - { .name = "PPChameleon ME Nand Flash", - .offset = 0, - .size = 32 * 1024 * 1024 - } -}; - -static struct mtd_partition partition_info_evb[] = { - { .name = "PPChameleonEVB Nand Flash", - .offset = 0, - .size = 32 * 1024 * 1024 - } -}; - -#define NUM_PARTITIONS 1 - -/* - * hardware specific access to control-lines - */ -static void ppchameleon_hwcontrol(struct mtd_info *mtdinfo, int cmd, - unsigned int ctrl) -{ - struct nand_chip *chip = mtd->priv; - - if (ctrl & NAND_CTRL_CHANGE) { -#error Missing headerfiles. No way to fix this. -tglx - switch (cmd) { - case NAND_CTL_SETCLE: - MACRO_NAND_CTL_SETCLE((unsigned long)CFG_NAND0_PADDR); - break; - case NAND_CTL_CLRCLE: - MACRO_NAND_CTL_CLRCLE((unsigned long)CFG_NAND0_PADDR); - break; - case NAND_CTL_SETALE: - MACRO_NAND_CTL_SETALE((unsigned long)CFG_NAND0_PADDR); - break; - case NAND_CTL_CLRALE: - MACRO_NAND_CTL_CLRALE((unsigned long)CFG_NAND0_PADDR); - break; - case NAND_CTL_SETNCE: - MACRO_NAND_ENABLE_CE((unsigned long)CFG_NAND0_PADDR); - break; - case NAND_CTL_CLRNCE: - MACRO_NAND_DISABLE_CE((unsigned long)CFG_NAND0_PADDR); - break; - } - } - if (cmd != NAND_CMD_NONE) - writeb(cmd, chip->IO_ADDR_W); -} - -static void ppchameleonevb_hwcontrol(struct mtd_info *mtdinfo, int cmd, - unsigned int ctrl) -{ - struct nand_chip *chip = mtd->priv; - - if (ctrl & NAND_CTRL_CHANGE) { -#error Missing headerfiles. No way to fix this. -tglx - switch (cmd) { - case NAND_CTL_SETCLE: - MACRO_NAND_CTL_SETCLE((unsigned long)CFG_NAND1_PADDR); - break; - case NAND_CTL_CLRCLE: - MACRO_NAND_CTL_CLRCLE((unsigned long)CFG_NAND1_PADDR); - break; - case NAND_CTL_SETALE: - MACRO_NAND_CTL_SETALE((unsigned long)CFG_NAND1_PADDR); - break; - case NAND_CTL_CLRALE: - MACRO_NAND_CTL_CLRALE((unsigned long)CFG_NAND1_PADDR); - break; - case NAND_CTL_SETNCE: - MACRO_NAND_ENABLE_CE((unsigned long)CFG_NAND1_PADDR); - break; - case NAND_CTL_CLRNCE: - MACRO_NAND_DISABLE_CE((unsigned long)CFG_NAND1_PADDR); - break; - } - } - if (cmd != NAND_CMD_NONE) - writeb(cmd, chip->IO_ADDR_W); -} - -#ifdef USE_READY_BUSY_PIN -/* - * read device ready pin - */ -static int ppchameleon_device_ready(struct mtd_info *minfo) -{ - if (in_be32((volatile unsigned *)GPIO0_IR) & NAND_RB_GPIO_PIN) - return 1; - return 0; -} - -static int ppchameleonevb_device_ready(struct mtd_info *minfo) -{ - if (in_be32((volatile unsigned *)GPIO0_IR) & NAND_EVB_RB_GPIO_PIN) - return 1; - return 0; -} -#endif - -/* - * Main initialization routine - */ -static int __init ppchameleonevb_init(void) -{ - struct nand_chip *this; - void __iomem *ppchameleon_fio_base; - void __iomem *ppchameleonevb_fio_base; - - /********************************* - * Processor module NAND (if any) * - *********************************/ - /* Allocate memory for MTD device structure and private data */ - ppchameleon_mtd = kmalloc(sizeof(struct mtd_info) + sizeof(struct nand_chip), GFP_KERNEL); - if (!ppchameleon_mtd) { - printk("Unable to allocate PPChameleon NAND MTD device structure.\n"); - return -ENOMEM; - } - - /* map physical address */ - ppchameleon_fio_base = ioremap(ppchameleon_fio_pbase, SZ_4M); - if (!ppchameleon_fio_base) { - printk("ioremap PPChameleon NAND flash failed\n"); - kfree(ppchameleon_mtd); - return -EIO; - } - - /* Get pointer to private data */ - this = (struct nand_chip *)(&ppchameleon_mtd[1]); - - /* Initialize structures */ - memset(ppchameleon_mtd, 0, sizeof(struct mtd_info)); - memset(this, 0, sizeof(struct nand_chip)); - - /* Link the private data with the MTD structure */ - ppchameleon_mtd->priv = this; - ppchameleon_mtd->owner = THIS_MODULE; - - /* Initialize GPIOs */ - /* Pin mapping for NAND chip */ - /* - CE GPIO_01 - CLE GPIO_02 - ALE GPIO_03 - R/B GPIO_04 - */ - /* output select */ - out_be32((volatile unsigned *)GPIO0_OSRH, in_be32((volatile unsigned *)GPIO0_OSRH) & 0xC0FFFFFF); - /* three-state select */ - out_be32((volatile unsigned *)GPIO0_TSRH, in_be32((volatile unsigned *)GPIO0_TSRH) & 0xC0FFFFFF); - /* enable output driver */ - out_be32((volatile unsigned *)GPIO0_TCR, - in_be32((volatile unsigned *)GPIO0_TCR) | NAND_nCE_GPIO_PIN | NAND_CLE_GPIO_PIN | NAND_ALE_GPIO_PIN); -#ifdef USE_READY_BUSY_PIN - /* three-state select */ - out_be32((volatile unsigned *)GPIO0_TSRH, in_be32((volatile unsigned *)GPIO0_TSRH) & 0xFF3FFFFF); - /* high-impedecence */ - out_be32((volatile unsigned *)GPIO0_TCR, in_be32((volatile unsigned *)GPIO0_TCR) & (~NAND_RB_GPIO_PIN)); - /* input select */ - out_be32((volatile unsigned *)GPIO0_ISR1H, - (in_be32((volatile unsigned *)GPIO0_ISR1H) & 0xFF3FFFFF) | 0x00400000); -#endif - - /* insert callbacks */ - this->IO_ADDR_R = ppchameleon_fio_base; - this->IO_ADDR_W = ppchameleon_fio_base; - this->cmd_ctrl = ppchameleon_hwcontrol; -#ifdef USE_READY_BUSY_PIN - this->dev_ready = ppchameleon_device_ready; -#endif - this->chip_delay = NAND_BIG_DELAY_US; - /* ECC mode */ - this->ecc.mode = NAND_ECC_SOFT; - - /* Scan to find existence of the device (it could not be mounted) */ - if (nand_scan(ppchameleon_mtd, 1)) { - iounmap((void *)ppchameleon_fio_base); - ppchameleon_fio_base = NULL; - kfree(ppchameleon_mtd); - goto nand_evb_init; - } -#ifndef USE_READY_BUSY_PIN - /* Adjust delay if necessary */ - if (ppchameleon_mtd->size == NAND_SMALL_SIZE) - this->chip_delay = NAND_SMALL_DELAY_US; -#endif - - ppchameleon_mtd->name = "ppchameleon-nand"; - - /* Register the partitions */ - mtd_device_parse_register(ppchameleon_mtd, NULL, NULL, - ppchameleon_mtd->size == NAND_SMALL_SIZE ? - partition_info_me : partition_info_hi, - NUM_PARTITIONS); - - nand_evb_init: - /**************************** - * EVB NAND (always present) * - ****************************/ - /* Allocate memory for MTD device structure and private data */ - ppchameleonevb_mtd = kmalloc(sizeof(struct mtd_info) + sizeof(struct nand_chip), GFP_KERNEL); - if (!ppchameleonevb_mtd) { - printk("Unable to allocate PPChameleonEVB NAND MTD device structure.\n"); - if (ppchameleon_fio_base) - iounmap(ppchameleon_fio_base); - return -ENOMEM; - } - - /* map physical address */ - ppchameleonevb_fio_base = ioremap(ppchameleonevb_fio_pbase, SZ_4M); - if (!ppchameleonevb_fio_base) { - printk("ioremap PPChameleonEVB NAND flash failed\n"); - kfree(ppchameleonevb_mtd); - if (ppchameleon_fio_base) - iounmap(ppchameleon_fio_base); - return -EIO; - } - - /* Get pointer to private data */ - this = (struct nand_chip *)(&ppchameleonevb_mtd[1]); - - /* Initialize structures */ - memset(ppchameleonevb_mtd, 0, sizeof(struct mtd_info)); - memset(this, 0, sizeof(struct nand_chip)); - - /* Link the private data with the MTD structure */ - ppchameleonevb_mtd->priv = this; - - /* Initialize GPIOs */ - /* Pin mapping for NAND chip */ - /* - CE GPIO_14 - CLE GPIO_15 - ALE GPIO_16 - R/B GPIO_31 - */ - /* output select */ - out_be32((volatile unsigned *)GPIO0_OSRH, in_be32((volatile unsigned *)GPIO0_OSRH) & 0xFFFFFFF0); - out_be32((volatile unsigned *)GPIO0_OSRL, in_be32((volatile unsigned *)GPIO0_OSRL) & 0x3FFFFFFF); - /* three-state select */ - out_be32((volatile unsigned *)GPIO0_TSRH, in_be32((volatile unsigned *)GPIO0_TSRH) & 0xFFFFFFF0); - out_be32((volatile unsigned *)GPIO0_TSRL, in_be32((volatile unsigned *)GPIO0_TSRL) & 0x3FFFFFFF); - /* enable output driver */ - out_be32((volatile unsigned *)GPIO0_TCR, in_be32((volatile unsigned *)GPIO0_TCR) | NAND_EVB_nCE_GPIO_PIN | - NAND_EVB_CLE_GPIO_PIN | NAND_EVB_ALE_GPIO_PIN); -#ifdef USE_READY_BUSY_PIN - /* three-state select */ - out_be32((volatile unsigned *)GPIO0_TSRL, in_be32((volatile unsigned *)GPIO0_TSRL) & 0xFFFFFFFC); - /* high-impedecence */ - out_be32((volatile unsigned *)GPIO0_TCR, in_be32((volatile unsigned *)GPIO0_TCR) & (~NAND_EVB_RB_GPIO_PIN)); - /* input select */ - out_be32((volatile unsigned *)GPIO0_ISR1L, - (in_be32((volatile unsigned *)GPIO0_ISR1L) & 0xFFFFFFFC) | 0x00000001); -#endif - - /* insert callbacks */ - this->IO_ADDR_R = ppchameleonevb_fio_base; - this->IO_ADDR_W = ppchameleonevb_fio_base; - this->cmd_ctrl = ppchameleonevb_hwcontrol; -#ifdef USE_READY_BUSY_PIN - this->dev_ready = ppchameleonevb_device_ready; -#endif - this->chip_delay = NAND_SMALL_DELAY_US; - - /* ECC mode */ - this->ecc.mode = NAND_ECC_SOFT; - - /* Scan to find existence of the device */ - if (nand_scan(ppchameleonevb_mtd, 1)) { - iounmap((void *)ppchameleonevb_fio_base); - kfree(ppchameleonevb_mtd); - if (ppchameleon_fio_base) - iounmap(ppchameleon_fio_base); - return -ENXIO; - } - - ppchameleonevb_mtd->name = NAND_EVB_MTD_NAME; - - /* Register the partitions */ - mtd_device_parse_register(ppchameleonevb_mtd, NULL, NULL, - ppchameleon_mtd->size == NAND_SMALL_SIZE ? - partition_info_me : partition_info_hi, - NUM_PARTITIONS); - - /* Return happy */ - return 0; -} - -module_init(ppchameleonevb_init); - -/* - * Clean up routine - */ -static void __exit ppchameleonevb_cleanup(void) -{ - struct nand_chip *this; - - /* Release resources, unregister device(s) */ - nand_release(ppchameleon_mtd); - nand_release(ppchameleonevb_mtd); - - /* Release iomaps */ - this = (struct nand_chip *) &ppchameleon_mtd[1]; - iounmap((void *) this->IO_ADDR_R); - this = (struct nand_chip *) &ppchameleonevb_mtd[1]; - iounmap((void *) this->IO_ADDR_R); - - /* Free the MTD device structure */ - kfree (ppchameleon_mtd); - kfree (ppchameleonevb_mtd); -} -module_exit(ppchameleonevb_cleanup); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("DAVE Srl "); -MODULE_DESCRIPTION("MTD map driver for DAVE Srl PPChameleonEVB board"); diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c index 37ee75c7bacb..dec80ca6a5ce 100644 --- a/drivers/mtd/nand/pxa3xx_nand.c +++ b/drivers/mtd/nand/pxa3xx_nand.c @@ -989,7 +989,7 @@ static int pxa3xx_nand_scan(struct mtd_info *mtd) } pxa3xx_flash_ids[0].name = f->name; - pxa3xx_flash_ids[0].id = (f->chip_id >> 8) & 0xffff; + pxa3xx_flash_ids[0].dev_id = (f->chip_id >> 8) & 0xffff; pxa3xx_flash_ids[0].pagesize = f->page_size; chipsize = (uint64_t)f->num_blocks * f->page_per_block * f->page_size; pxa3xx_flash_ids[0].chipsize = chipsize >> 20; diff --git a/drivers/mtd/nand/rtc_from4.c b/drivers/mtd/nand/rtc_from4.c deleted file mode 100644 index e55b5cfbe145..000000000000 --- a/drivers/mtd/nand/rtc_from4.c +++ /dev/null @@ -1,624 +0,0 @@ -/* - * drivers/mtd/nand/rtc_from4.c - * - * Copyright (C) 2004 Red Hat, Inc. - * - * Derived from drivers/mtd/nand/spia.c - * Copyright (C) 2000 Steven J. Hill (sjhill@realitydiluted.com) - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * Overview: - * This is a device driver for the AG-AND flash device found on the - * Renesas Technology Corp. Flash ROM 4-slot interface board (FROM_BOARD4), - * which utilizes the Renesas HN29V1G91T-30 part. - * This chip is a 1 GBibit (128MiB x 8 bits) AG-AND flash device. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* - * MTD structure for Renesas board - */ -static struct mtd_info *rtc_from4_mtd = NULL; - -#define RTC_FROM4_MAX_CHIPS 2 - -/* HS77x9 processor register defines */ -#define SH77X9_BCR1 ((volatile unsigned short *)(0xFFFFFF60)) -#define SH77X9_BCR2 ((volatile unsigned short *)(0xFFFFFF62)) -#define SH77X9_WCR1 ((volatile unsigned short *)(0xFFFFFF64)) -#define SH77X9_WCR2 ((volatile unsigned short *)(0xFFFFFF66)) -#define SH77X9_MCR ((volatile unsigned short *)(0xFFFFFF68)) -#define SH77X9_PCR ((volatile unsigned short *)(0xFFFFFF6C)) -#define SH77X9_FRQCR ((volatile unsigned short *)(0xFFFFFF80)) - -/* - * Values specific to the Renesas Technology Corp. FROM_BOARD4 (used with HS77x9 processor) - */ -/* Address where flash is mapped */ -#define RTC_FROM4_FIO_BASE 0x14000000 - -/* CLE and ALE are tied to address lines 5 & 4, respectively */ -#define RTC_FROM4_CLE (1 << 5) -#define RTC_FROM4_ALE (1 << 4) - -/* address lines A24-A22 used for chip selection */ -#define RTC_FROM4_NAND_ADDR_SLOT3 (0x00800000) -#define RTC_FROM4_NAND_ADDR_SLOT4 (0x00C00000) -#define RTC_FROM4_NAND_ADDR_FPGA (0x01000000) -/* mask address lines A24-A22 used for chip selection */ -#define RTC_FROM4_NAND_ADDR_MASK (RTC_FROM4_NAND_ADDR_SLOT3 | RTC_FROM4_NAND_ADDR_SLOT4 | RTC_FROM4_NAND_ADDR_FPGA) - -/* FPGA status register for checking device ready (bit zero) */ -#define RTC_FROM4_FPGA_SR (RTC_FROM4_NAND_ADDR_FPGA | 0x00000002) -#define RTC_FROM4_DEVICE_READY 0x0001 - -/* FPGA Reed-Solomon ECC Control register */ - -#define RTC_FROM4_RS_ECC_CTL (RTC_FROM4_NAND_ADDR_FPGA | 0x00000050) -#define RTC_FROM4_RS_ECC_CTL_CLR (1 << 7) -#define RTC_FROM4_RS_ECC_CTL_GEN (1 << 6) -#define RTC_FROM4_RS_ECC_CTL_FD_E (1 << 5) - -/* FPGA Reed-Solomon ECC code base */ -#define RTC_FROM4_RS_ECC (RTC_FROM4_NAND_ADDR_FPGA | 0x00000060) -#define RTC_FROM4_RS_ECCN (RTC_FROM4_NAND_ADDR_FPGA | 0x00000080) - -/* FPGA Reed-Solomon ECC check register */ -#define RTC_FROM4_RS_ECC_CHK (RTC_FROM4_NAND_ADDR_FPGA | 0x00000070) -#define RTC_FROM4_RS_ECC_CHK_ERROR (1 << 7) - -#define ERR_STAT_ECC_AVAILABLE 0x20 - -/* Undefine for software ECC */ -#define RTC_FROM4_HWECC 1 - -/* Define as 1 for no virtual erase blocks (in JFFS2) */ -#define RTC_FROM4_NO_VIRTBLOCKS 0 - -/* - * Module stuff - */ -static void __iomem *rtc_from4_fio_base = (void *)P2SEGADDR(RTC_FROM4_FIO_BASE); - -static const struct mtd_partition partition_info[] = { - { - .name = "Renesas flash partition 1", - .offset = 0, - .size = MTDPART_SIZ_FULL}, -}; - -#define NUM_PARTITIONS 1 - -/* - * hardware specific flash bbt decriptors - * Note: this is to allow debugging by disabling - * NAND_BBT_CREATE and/or NAND_BBT_WRITE - * - */ -static uint8_t bbt_pattern[] = { 'B', 'b', 't', '0' }; -static uint8_t mirror_pattern[] = { '1', 't', 'b', 'B' }; - -static struct nand_bbt_descr rtc_from4_bbt_main_descr = { - .options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE - | NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP, - .offs = 40, - .len = 4, - .veroffs = 44, - .maxblocks = 4, - .pattern = bbt_pattern -}; - -static struct nand_bbt_descr rtc_from4_bbt_mirror_descr = { - .options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE - | NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP, - .offs = 40, - .len = 4, - .veroffs = 44, - .maxblocks = 4, - .pattern = mirror_pattern -}; - -#ifdef RTC_FROM4_HWECC - -/* the Reed Solomon control structure */ -static struct rs_control *rs_decoder; - -/* - * hardware specific Out Of Band information - */ -static struct nand_ecclayout rtc_from4_nand_oobinfo = { - .eccbytes = 32, - .eccpos = { - 0, 1, 2, 3, 4, 5, 6, 7, - 8, 9, 10, 11, 12, 13, 14, 15, - 16, 17, 18, 19, 20, 21, 22, 23, - 24, 25, 26, 27, 28, 29, 30, 31}, - .oobfree = {{32, 32}} -}; - -#endif - -/* - * rtc_from4_hwcontrol - hardware specific access to control-lines - * @mtd: MTD device structure - * @cmd: hardware control command - * - * Address lines (A5 and A4) are used to control Command and Address Latch - * Enable on this board, so set the read/write address appropriately. - * - * Chip Enable is also controlled by the Chip Select (CS5) and - * Address lines (A24-A22), so no action is required here. - * - */ -static void rtc_from4_hwcontrol(struct mtd_info *mtd, int cmd, - unsigned int ctrl) -{ - struct nand_chip *chip = (mtd->priv); - - if (cmd == NAND_CMD_NONE) - return; - - if (ctrl & NAND_CLE) - writeb(cmd, chip->IO_ADDR_W | RTC_FROM4_CLE); - else - writeb(cmd, chip->IO_ADDR_W | RTC_FROM4_ALE); -} - -/* - * rtc_from4_nand_select_chip - hardware specific chip select - * @mtd: MTD device structure - * @chip: Chip to select (0 == slot 3, 1 == slot 4) - * - * The chip select is based on address lines A24-A22. - * This driver uses flash slots 3 and 4 (A23-A22). - * - */ -static void rtc_from4_nand_select_chip(struct mtd_info *mtd, int chip) -{ - struct nand_chip *this = mtd->priv; - - this->IO_ADDR_R = (void __iomem *)((unsigned long)this->IO_ADDR_R & ~RTC_FROM4_NAND_ADDR_MASK); - this->IO_ADDR_W = (void __iomem *)((unsigned long)this->IO_ADDR_W & ~RTC_FROM4_NAND_ADDR_MASK); - - switch (chip) { - - case 0: /* select slot 3 chip */ - this->IO_ADDR_R = (void __iomem *)((unsigned long)this->IO_ADDR_R | RTC_FROM4_NAND_ADDR_SLOT3); - this->IO_ADDR_W = (void __iomem *)((unsigned long)this->IO_ADDR_W | RTC_FROM4_NAND_ADDR_SLOT3); - break; - case 1: /* select slot 4 chip */ - this->IO_ADDR_R = (void __iomem *)((unsigned long)this->IO_ADDR_R | RTC_FROM4_NAND_ADDR_SLOT4); - this->IO_ADDR_W = (void __iomem *)((unsigned long)this->IO_ADDR_W | RTC_FROM4_NAND_ADDR_SLOT4); - break; - - } -} - -/* - * rtc_from4_nand_device_ready - hardware specific ready/busy check - * @mtd: MTD device structure - * - * This board provides the Ready/Busy state in the status register - * of the FPGA. Bit zero indicates the RDY(1)/BSY(0) signal. - * - */ -static int rtc_from4_nand_device_ready(struct mtd_info *mtd) -{ - unsigned short status; - - status = *((volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_FPGA_SR)); - - return (status & RTC_FROM4_DEVICE_READY); - -} - -/* - * deplete - code to perform device recovery in case there was a power loss - * @mtd: MTD device structure - * @chip: Chip to select (0 == slot 3, 1 == slot 4) - * - * If there was a sudden loss of power during an erase operation, a - * "device recovery" operation must be performed when power is restored - * to ensure correct operation. This routine performs the required steps - * for the requested chip. - * - * See page 86 of the data sheet for details. - * - */ -static void deplete(struct mtd_info *mtd, int chip) -{ - struct nand_chip *this = mtd->priv; - - /* wait until device is ready */ - while (!this->dev_ready(mtd)) ; - - this->select_chip(mtd, chip); - - /* Send the commands for device recovery, phase 1 */ - this->cmdfunc(mtd, NAND_CMD_DEPLETE1, 0x0000, 0x0000); - this->cmdfunc(mtd, NAND_CMD_DEPLETE2, -1, -1); - - /* Send the commands for device recovery, phase 2 */ - this->cmdfunc(mtd, NAND_CMD_DEPLETE1, 0x0000, 0x0004); - this->cmdfunc(mtd, NAND_CMD_DEPLETE2, -1, -1); - -} - -#ifdef RTC_FROM4_HWECC -/* - * rtc_from4_enable_hwecc - hardware specific hardware ECC enable function - * @mtd: MTD device structure - * @mode: I/O mode; read or write - * - * enable hardware ECC for data read or write - * - */ -static void rtc_from4_enable_hwecc(struct mtd_info *mtd, int mode) -{ - volatile unsigned short *rs_ecc_ctl = (volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_RS_ECC_CTL); - unsigned short status; - - switch (mode) { - case NAND_ECC_READ: - status = RTC_FROM4_RS_ECC_CTL_CLR | RTC_FROM4_RS_ECC_CTL_FD_E; - - *rs_ecc_ctl = status; - break; - - case NAND_ECC_READSYN: - status = 0x00; - - *rs_ecc_ctl = status; - break; - - case NAND_ECC_WRITE: - status = RTC_FROM4_RS_ECC_CTL_CLR | RTC_FROM4_RS_ECC_CTL_GEN | RTC_FROM4_RS_ECC_CTL_FD_E; - - *rs_ecc_ctl = status; - break; - - default: - BUG(); - break; - } - -} - -/* - * rtc_from4_calculate_ecc - hardware specific code to read ECC code - * @mtd: MTD device structure - * @dat: buffer containing the data to generate ECC codes - * @ecc_code ECC codes calculated - * - * The ECC code is calculated by the FPGA. All we have to do is read the values - * from the FPGA registers. - * - * Note: We read from the inverted registers, since data is inverted before - * the code is calculated. So all 0xff data (blank page) results in all 0xff rs code - * - */ -static void rtc_from4_calculate_ecc(struct mtd_info *mtd, const u_char *dat, u_char *ecc_code) -{ - volatile unsigned short *rs_eccn = (volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_RS_ECCN); - unsigned short value; - int i; - - for (i = 0; i < 8; i++) { - value = *rs_eccn; - ecc_code[i] = (unsigned char)value; - rs_eccn++; - } - ecc_code[7] |= 0x0f; /* set the last four bits (not used) */ -} - -/* - * rtc_from4_correct_data - hardware specific code to correct data using ECC code - * @mtd: MTD device structure - * @buf: buffer containing the data to generate ECC codes - * @ecc1 ECC codes read - * @ecc2 ECC codes calculated - * - * The FPGA tells us fast, if there's an error or not. If no, we go back happy - * else we read the ecc results from the fpga and call the rs library to decode - * and hopefully correct the error. - * - */ -static int rtc_from4_correct_data(struct mtd_info *mtd, const u_char *buf, u_char *ecc1, u_char *ecc2) -{ - int i, j, res; - unsigned short status; - uint16_t par[6], syn[6]; - uint8_t ecc[8]; - volatile unsigned short *rs_ecc; - - status = *((volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_RS_ECC_CHK)); - - if (!(status & RTC_FROM4_RS_ECC_CHK_ERROR)) { - return 0; - } - - /* Read the syndrome pattern from the FPGA and correct the bitorder */ - rs_ecc = (volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_RS_ECC); - for (i = 0; i < 8; i++) { - ecc[i] = bitrev8(*rs_ecc); - rs_ecc++; - } - - /* convert into 6 10bit syndrome fields */ - par[5] = rs_decoder->index_of[(((uint16_t) ecc[0] >> 0) & 0x0ff) | (((uint16_t) ecc[1] << 8) & 0x300)]; - par[4] = rs_decoder->index_of[(((uint16_t) ecc[1] >> 2) & 0x03f) | (((uint16_t) ecc[2] << 6) & 0x3c0)]; - par[3] = rs_decoder->index_of[(((uint16_t) ecc[2] >> 4) & 0x00f) | (((uint16_t) ecc[3] << 4) & 0x3f0)]; - par[2] = rs_decoder->index_of[(((uint16_t) ecc[3] >> 6) & 0x003) | (((uint16_t) ecc[4] << 2) & 0x3fc)]; - par[1] = rs_decoder->index_of[(((uint16_t) ecc[5] >> 0) & 0x0ff) | (((uint16_t) ecc[6] << 8) & 0x300)]; - par[0] = (((uint16_t) ecc[6] >> 2) & 0x03f) | (((uint16_t) ecc[7] << 6) & 0x3c0); - - /* Convert to computable syndrome */ - for (i = 0; i < 6; i++) { - syn[i] = par[0]; - for (j = 1; j < 6; j++) - if (par[j] != rs_decoder->nn) - syn[i] ^= rs_decoder->alpha_to[rs_modnn(rs_decoder, par[j] + i * j)]; - - /* Convert to index form */ - syn[i] = rs_decoder->index_of[syn[i]]; - } - - /* Let the library code do its magic. */ - res = decode_rs8(rs_decoder, (uint8_t *) buf, par, 512, syn, 0, NULL, 0xff, NULL); - if (res > 0) { - pr_debug("rtc_from4_correct_data: " "ECC corrected %d errors on read\n", res); - } - return res; -} - -/** - * rtc_from4_errstat - perform additional error status checks - * @mtd: MTD device structure - * @this: NAND chip structure - * @state: state or the operation - * @status: status code returned from read status - * @page: startpage inside the chip, must be called with (page & this->pagemask) - * - * Perform additional error status checks on erase and write failures - * to determine if errors are correctable. For this device, correctable - * 1-bit errors on erase and write are considered acceptable. - * - * note: see pages 34..37 of data sheet for details. - * - */ -static int rtc_from4_errstat(struct mtd_info *mtd, struct nand_chip *this, - int state, int status, int page) -{ - int er_stat = 0; - int rtn, retlen; - size_t len; - uint8_t *buf; - int i; - - this->cmdfunc(mtd, NAND_CMD_STATUS_CLEAR, -1, -1); - - if (state == FL_ERASING) { - - for (i = 0; i < 4; i++) { - if (!(status & 1 << (i + 1))) - continue; - this->cmdfunc(mtd, (NAND_CMD_STATUS_ERROR + i + 1), - -1, -1); - rtn = this->read_byte(mtd); - this->cmdfunc(mtd, NAND_CMD_STATUS_RESET, -1, -1); - - /* err_ecc_not_avail */ - if (!(rtn & ERR_STAT_ECC_AVAILABLE)) - er_stat |= 1 << (i + 1); - } - - } else if (state == FL_WRITING) { - - unsigned long corrected = mtd->ecc_stats.corrected; - - /* single bank write logic */ - this->cmdfunc(mtd, NAND_CMD_STATUS_ERROR, -1, -1); - rtn = this->read_byte(mtd); - this->cmdfunc(mtd, NAND_CMD_STATUS_RESET, -1, -1); - - if (!(rtn & ERR_STAT_ECC_AVAILABLE)) { - /* err_ecc_not_avail */ - er_stat |= 1 << 1; - goto out; - } - - len = mtd->writesize; - buf = kmalloc(len, GFP_KERNEL); - if (!buf) { - er_stat = 1; - goto out; - } - - /* recovery read */ - rtn = nand_do_read(mtd, page, len, &retlen, buf); - - /* if read failed or > 1-bit error corrected */ - if (rtn || (mtd->ecc_stats.corrected - corrected) > 1) - er_stat |= 1 << 1; - kfree(buf); - } -out: - rtn = status; - if (er_stat == 0) { /* if ECC is available */ - rtn = (status & ~NAND_STATUS_FAIL); /* clear the error bit */ - } - - return rtn; -} -#endif - -/* - * Main initialization routine - */ -static int __init rtc_from4_init(void) -{ - struct nand_chip *this; - unsigned short bcr1, bcr2, wcr2; - int i; - int ret; - - /* Allocate memory for MTD device structure and private data */ - rtc_from4_mtd = kmalloc(sizeof(struct mtd_info) + sizeof(struct nand_chip), GFP_KERNEL); - if (!rtc_from4_mtd) { - printk("Unable to allocate Renesas NAND MTD device structure.\n"); - return -ENOMEM; - } - - /* Get pointer to private data */ - this = (struct nand_chip *)(&rtc_from4_mtd[1]); - - /* Initialize structures */ - memset(rtc_from4_mtd, 0, sizeof(struct mtd_info)); - memset(this, 0, sizeof(struct nand_chip)); - - /* Link the private data with the MTD structure */ - rtc_from4_mtd->priv = this; - rtc_from4_mtd->owner = THIS_MODULE; - - /* set area 5 as PCMCIA mode to clear the spec of tDH(Data hold time;9ns min) */ - bcr1 = *SH77X9_BCR1 & ~0x0002; - bcr1 |= 0x0002; - *SH77X9_BCR1 = bcr1; - - /* set */ - bcr2 = *SH77X9_BCR2 & ~0x0c00; - bcr2 |= 0x0800; - *SH77X9_BCR2 = bcr2; - - /* set area 5 wait states */ - wcr2 = *SH77X9_WCR2 & ~0x1c00; - wcr2 |= 0x1c00; - *SH77X9_WCR2 = wcr2; - - /* Set address of NAND IO lines */ - this->IO_ADDR_R = rtc_from4_fio_base; - this->IO_ADDR_W = rtc_from4_fio_base; - /* Set address of hardware control function */ - this->cmd_ctrl = rtc_from4_hwcontrol; - /* Set address of chip select function */ - this->select_chip = rtc_from4_nand_select_chip; - /* command delay time (in us) */ - this->chip_delay = 100; - /* return the status of the Ready/Busy line */ - this->dev_ready = rtc_from4_nand_device_ready; - -#ifdef RTC_FROM4_HWECC - printk(KERN_INFO "rtc_from4_init: using hardware ECC detection.\n"); - - this->ecc.mode = NAND_ECC_HW_SYNDROME; - this->ecc.size = 512; - this->ecc.bytes = 8; - this->ecc.strength = 3; - /* return the status of extra status and ECC checks */ - this->errstat = rtc_from4_errstat; - /* set the nand_oobinfo to support FPGA H/W error detection */ - this->ecc.layout = &rtc_from4_nand_oobinfo; - this->ecc.hwctl = rtc_from4_enable_hwecc; - this->ecc.calculate = rtc_from4_calculate_ecc; - this->ecc.correct = rtc_from4_correct_data; - - /* We could create the decoder on demand, if memory is a concern. - * This way we have it handy, if an error happens - * - * Symbolsize is 10 (bits) - * Primitve polynomial is x^10+x^3+1 - * first consecutive root is 0 - * primitve element to generate roots = 1 - * generator polinomial degree = 6 - */ - rs_decoder = init_rs(10, 0x409, 0, 1, 6); - if (!rs_decoder) { - printk(KERN_ERR "Could not create a RS decoder\n"); - ret = -ENOMEM; - goto err_1; - } -#else - printk(KERN_INFO "rtc_from4_init: using software ECC detection.\n"); - - this->ecc.mode = NAND_ECC_SOFT; -#endif - - /* set the bad block tables to support debugging */ - this->bbt_td = &rtc_from4_bbt_main_descr; - this->bbt_md = &rtc_from4_bbt_mirror_descr; - - /* Scan to find existence of the device */ - if (nand_scan(rtc_from4_mtd, RTC_FROM4_MAX_CHIPS)) { - ret = -ENXIO; - goto err_2; - } - - /* Perform 'device recovery' for each chip in case there was a power loss. */ - for (i = 0; i < this->numchips; i++) { - deplete(rtc_from4_mtd, i); - } - -#if RTC_FROM4_NO_VIRTBLOCKS - /* use a smaller erase block to minimize wasted space when a block is bad */ - /* note: this uses eight times as much RAM as using the default and makes */ - /* mounts take four times as long. */ - rtc_from4_mtd->flags |= MTD_NO_VIRTBLOCKS; -#endif - - /* Register the partitions */ - ret = mtd_device_register(rtc_from4_mtd, partition_info, - NUM_PARTITIONS); - if (ret) - goto err_3; - - /* Return happy */ - return 0; -err_3: - nand_release(rtc_from4_mtd); -err_2: - free_rs(rs_decoder); -err_1: - kfree(rtc_from4_mtd); - return ret; -} - -module_init(rtc_from4_init); - -/* - * Clean up routine - */ -static void __exit rtc_from4_cleanup(void) -{ - /* Release resource, unregister partitions */ - nand_release(rtc_from4_mtd); - - /* Free the MTD device structure */ - kfree(rtc_from4_mtd); - -#ifdef RTC_FROM4_HWECC - /* Free the reed solomon resources */ - if (rs_decoder) { - free_rs(rs_decoder); - } -#endif -} - -module_exit(rtc_from4_cleanup); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("d.marlin #include #include +#include #include "sm_common.h" static struct nand_ecclayout nand_oob_sm = { @@ -67,44 +68,37 @@ static int sm_block_markbad(struct mtd_info *mtd, loff_t ofs) return error; } - static struct nand_flash_dev nand_smartmedia_flash_ids[] = { - {"SmartMedia 1MiB 5V", 0x6e, 256, 1, 0x1000, 0}, - {"SmartMedia 1MiB 3,3V", 0xe8, 256, 1, 0x1000, 0}, - {"SmartMedia 1MiB 3,3V", 0xec, 256, 1, 0x1000, 0}, - {"SmartMedia 2MiB 3,3V", 0xea, 256, 2, 0x1000, 0}, - {"SmartMedia 2MiB 5V", 0x64, 256, 2, 0x1000, 0}, - {"SmartMedia 2MiB 3,3V ROM", 0x5d, 512, 2, 0x2000, NAND_ROM}, - {"SmartMedia 4MiB 3,3V", 0xe3, 512, 4, 0x2000, 0}, - {"SmartMedia 4MiB 3,3/5V", 0xe5, 512, 4, 0x2000, 0}, - {"SmartMedia 4MiB 5V", 0x6b, 512, 4, 0x2000, 0}, - {"SmartMedia 4MiB 3,3V ROM", 0xd5, 512, 4, 0x2000, NAND_ROM}, - {"SmartMedia 8MiB 3,3V", 0xe6, 512, 8, 0x2000, 0}, - {"SmartMedia 8MiB 3,3V ROM", 0xd6, 512, 8, 0x2000, NAND_ROM}, - {"SmartMedia 16MiB 3,3V", 0x73, 512, 16, 0x4000, 0}, - {"SmartMedia 16MiB 3,3V ROM", 0x57, 512, 16, 0x4000, NAND_ROM}, - {"SmartMedia 32MiB 3,3V", 0x75, 512, 32, 0x4000, 0}, - {"SmartMedia 32MiB 3,3V ROM", 0x58, 512, 32, 0x4000, NAND_ROM}, - {"SmartMedia 64MiB 3,3V", 0x76, 512, 64, 0x4000, 0}, - {"SmartMedia 64MiB 3,3V ROM", 0xd9, 512, 64, 0x4000, NAND_ROM}, - {"SmartMedia 128MiB 3,3V", 0x79, 512, 128, 0x4000, 0}, - {"SmartMedia 128MiB 3,3V ROM", 0xda, 512, 128, 0x4000, NAND_ROM}, - {"SmartMedia 256MiB 3,3V", 0x71, 512, 256, 0x4000 }, - {"SmartMedia 256MiB 3,3V ROM", 0x5b, 512, 256, 0x4000, NAND_ROM}, - {NULL,} + LEGACY_ID_NAND("SmartMedia 2MiB 3,3V ROM", 0x5d, 2, SZ_8K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 4MiB 3,3V", 0xe3, 4, SZ_8K, 0), + LEGACY_ID_NAND("SmartMedia 4MiB 3,3/5V", 0xe5, 4, SZ_8K, 0), + LEGACY_ID_NAND("SmartMedia 4MiB 5V", 0x6b, 4, SZ_8K, 0), + LEGACY_ID_NAND("SmartMedia 4MiB 3,3V ROM", 0xd5, 4, SZ_8K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 8MiB 3,3V", 0xe6, 8, SZ_8K, 0), + LEGACY_ID_NAND("SmartMedia 8MiB 3,3V ROM", 0xd6, 8, SZ_8K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 16MiB 3,3V", 0x73, 16, SZ_16K, 0), + LEGACY_ID_NAND("SmartMedia 16MiB 3,3V ROM", 0x57, 16, SZ_16K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 32MiB 3,3V", 0x75, 32, SZ_16K, 0), + LEGACY_ID_NAND("SmartMedia 32MiB 3,3V ROM", 0x58, 32, SZ_16K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 64MiB 3,3V", 0x76, 64, SZ_16K, 0), + LEGACY_ID_NAND("SmartMedia 64MiB 3,3V ROM", 0xd9, 64, SZ_16K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 128MiB 3,3V", 0x79, 128, SZ_16K, 0), + LEGACY_ID_NAND("SmartMedia 128MiB 3,3V ROM", 0xda, 128, SZ_16K, NAND_ROM), + LEGACY_ID_NAND("SmartMedia 256MiB 3, 3V", 0x71, 256, SZ_16K, 0), + LEGACY_ID_NAND("SmartMedia 256MiB 3,3V ROM", 0x5b, 256, SZ_16K, NAND_ROM), + {NULL} }; static struct nand_flash_dev nand_xd_flash_ids[] = { - - {"xD 16MiB 3,3V", 0x73, 512, 16, 0x4000, 0}, - {"xD 32MiB 3,3V", 0x75, 512, 32, 0x4000, 0}, - {"xD 64MiB 3,3V", 0x76, 512, 64, 0x4000, 0}, - {"xD 128MiB 3,3V", 0x79, 512, 128, 0x4000, 0}, - {"xD 256MiB 3,3V", 0x71, 512, 256, 0x4000, NAND_BROKEN_XD}, - {"xD 512MiB 3,3V", 0xdc, 512, 512, 0x4000, NAND_BROKEN_XD}, - {"xD 1GiB 3,3V", 0xd3, 512, 1024, 0x4000, NAND_BROKEN_XD}, - {"xD 2GiB 3,3V", 0xd5, 512, 2048, 0x4000, NAND_BROKEN_XD}, - {NULL,} + LEGACY_ID_NAND("xD 16MiB 3,3V", 0x73, 16, SZ_16K, 0), + LEGACY_ID_NAND("xD 32MiB 3,3V", 0x75, 32, SZ_16K, 0), + LEGACY_ID_NAND("xD 64MiB 3,3V", 0x76, 64, SZ_16K, 0), + LEGACY_ID_NAND("xD 128MiB 3,3V", 0x79, 128, SZ_16K, 0), + LEGACY_ID_NAND("xD 256MiB 3,3V", 0x71, 256, SZ_16K, NAND_BROKEN_XD), + LEGACY_ID_NAND("xD 512MiB 3,3V", 0xdc, 512, SZ_16K, NAND_BROKEN_XD), + LEGACY_ID_NAND("xD 1GiB 3,3V", 0xd3, 1024, SZ_16K, NAND_BROKEN_XD), + LEGACY_ID_NAND("xD 2GiB 3,3V", 0xd5, 2048, SZ_16K, NAND_BROKEN_XD), + {NULL} }; int sm_register_device(struct mtd_info *mtd, int smartmedia) diff --git a/drivers/mtd/nand/txx9ndfmc.c b/drivers/mtd/nand/txx9ndfmc.c index e1e8748aa47b..7ed654c68b08 100644 --- a/drivers/mtd/nand/txx9ndfmc.c +++ b/drivers/mtd/nand/txx9ndfmc.c @@ -427,18 +427,7 @@ static struct platform_driver txx9ndfmc_driver = { }, }; -static int __init txx9ndfmc_init(void) -{ - return platform_driver_probe(&txx9ndfmc_driver, txx9ndfmc_probe); -} - -static void __exit txx9ndfmc_exit(void) -{ - platform_driver_unregister(&txx9ndfmc_driver); -} - -module_init(txx9ndfmc_init); -module_exit(txx9ndfmc_exit); +module_platform_driver_probe(txx9ndfmc_driver, txx9ndfmc_probe); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("TXx9 SoC NAND flash controller driver"); diff --git a/drivers/mtd/ofpart.c b/drivers/mtd/ofpart.c index 30bd907a260a..553d6d6d5603 100644 --- a/drivers/mtd/ofpart.c +++ b/drivers/mtd/ofpart.c @@ -55,6 +55,7 @@ static int parse_ofpart_partitions(struct mtd_info *master, while ((pp = of_get_next_child(node, pp))) { const __be32 *reg; int len; + int a_cells, s_cells; reg = of_get_property(pp, "reg", &len); if (!reg) { @@ -62,8 +63,10 @@ static int parse_ofpart_partitions(struct mtd_info *master, continue; } - (*pparts)[i].offset = be32_to_cpu(reg[0]); - (*pparts)[i].size = be32_to_cpu(reg[1]); + a_cells = of_n_addr_cells(pp); + s_cells = of_n_size_cells(pp); + (*pparts)[i].offset = of_read_number(reg, a_cells); + (*pparts)[i].size = of_read_number(reg + a_cells, s_cells); partname = of_get_property(pp, "label", &len); if (!partname) diff --git a/drivers/mtd/onenand/Kconfig b/drivers/mtd/onenand/Kconfig index 91467bb03634..ab2607273e80 100644 --- a/drivers/mtd/onenand/Kconfig +++ b/drivers/mtd/onenand/Kconfig @@ -40,7 +40,6 @@ config MTD_ONENAND_SAMSUNG config MTD_ONENAND_OTP bool "OneNAND OTP Support" - select HAVE_MTD_OTP help One Block of the NAND Flash Array memory is reserved as a One-Time Programmable Block memory area. @@ -68,10 +67,4 @@ config MTD_ONENAND_2X_PROGRAM And more recent chips -config MTD_ONENAND_SIM - tristate "OneNAND simulator support" - help - The simulator may simulate various OneNAND flash chips for the - OneNAND MTD layer. - endif # MTD_ONENAND diff --git a/drivers/mtd/onenand/Makefile b/drivers/mtd/onenand/Makefile index 2b7884c7577e..9d6540e8b3d2 100644 --- a/drivers/mtd/onenand/Makefile +++ b/drivers/mtd/onenand/Makefile @@ -10,7 +10,4 @@ obj-$(CONFIG_MTD_ONENAND_GENERIC) += generic.o obj-$(CONFIG_MTD_ONENAND_OMAP2) += omap2.o obj-$(CONFIG_MTD_ONENAND_SAMSUNG) += samsung.o -# Simulator -obj-$(CONFIG_MTD_ONENAND_SIM) += onenand_sim.o - onenand-objs = onenand_base.o onenand_bbt.o diff --git a/drivers/mtd/onenand/omap2.c b/drivers/mtd/onenand/omap2.c index eec2aedb4ab8..d98b198edd53 100644 --- a/drivers/mtd/onenand/omap2.c +++ b/drivers/mtd/onenand/omap2.c @@ -832,19 +832,7 @@ static struct platform_driver omap2_onenand_driver = { }, }; -static int __init omap2_onenand_init(void) -{ - printk(KERN_INFO "OneNAND driver initializing\n"); - return platform_driver_register(&omap2_onenand_driver); -} - -static void __exit omap2_onenand_exit(void) -{ - platform_driver_unregister(&omap2_onenand_driver); -} - -module_init(omap2_onenand_init); -module_exit(omap2_onenand_exit); +module_platform_driver(omap2_onenand_driver); MODULE_ALIAS("platform:" DRIVER_NAME); MODULE_LICENSE("GPL"); diff --git a/drivers/mtd/onenand/onenand_sim.c b/drivers/mtd/onenand/onenand_sim.c deleted file mode 100644 index 85399e3accda..000000000000 --- a/drivers/mtd/onenand/onenand_sim.c +++ /dev/null @@ -1,564 +0,0 @@ -/* - * linux/drivers/mtd/onenand/onenand_sim.c - * - * The OneNAND simulator - * - * Copyright © 2005-2007 Samsung Electronics - * Kyungmin Park - * - * Vishak G , Rohit Hagargundgi - * Flex-OneNAND simulator support - * Copyright (C) Samsung Electronics, 2008 - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#ifndef CONFIG_ONENAND_SIM_MANUFACTURER -#define CONFIG_ONENAND_SIM_MANUFACTURER 0xec -#endif - -#ifndef CONFIG_ONENAND_SIM_DEVICE_ID -#define CONFIG_ONENAND_SIM_DEVICE_ID 0x04 -#endif - -#define CONFIG_FLEXONENAND ((CONFIG_ONENAND_SIM_DEVICE_ID >> 9) & 1) - -#ifndef CONFIG_ONENAND_SIM_VERSION_ID -#define CONFIG_ONENAND_SIM_VERSION_ID 0x1e -#endif - -#ifndef CONFIG_ONENAND_SIM_TECHNOLOGY_ID -#define CONFIG_ONENAND_SIM_TECHNOLOGY_ID CONFIG_FLEXONENAND -#endif - -/* Initial boundary values for Flex-OneNAND Simulator */ -#ifndef CONFIG_FLEXONENAND_SIM_DIE0_BOUNDARY -#define CONFIG_FLEXONENAND_SIM_DIE0_BOUNDARY 0x01 -#endif - -#ifndef CONFIG_FLEXONENAND_SIM_DIE1_BOUNDARY -#define CONFIG_FLEXONENAND_SIM_DIE1_BOUNDARY 0x01 -#endif - -static int manuf_id = CONFIG_ONENAND_SIM_MANUFACTURER; -static int device_id = CONFIG_ONENAND_SIM_DEVICE_ID; -static int version_id = CONFIG_ONENAND_SIM_VERSION_ID; -static int technology_id = CONFIG_ONENAND_SIM_TECHNOLOGY_ID; -static int boundary[] = { - CONFIG_FLEXONENAND_SIM_DIE0_BOUNDARY, - CONFIG_FLEXONENAND_SIM_DIE1_BOUNDARY, -}; - -struct onenand_flash { - void __iomem *base; - void __iomem *data; -}; - -#define ONENAND_CORE(flash) (flash->data) -#define ONENAND_CORE_SPARE(flash, this, offset) \ - ((flash->data) + (this->chipsize) + (offset >> 5)) - -#define ONENAND_MAIN_AREA(this, offset) \ - (this->base + ONENAND_DATARAM + offset) - -#define ONENAND_SPARE_AREA(this, offset) \ - (this->base + ONENAND_SPARERAM + offset) - -#define ONENAND_GET_WP_STATUS(this) \ - (readw(this->base + ONENAND_REG_WP_STATUS)) - -#define ONENAND_SET_WP_STATUS(v, this) \ - (writew(v, this->base + ONENAND_REG_WP_STATUS)) - -/* It has all 0xff chars */ -#define MAX_ONENAND_PAGESIZE (4096 + 128) -static unsigned char *ffchars; - -#if CONFIG_FLEXONENAND -#define PARTITION_NAME "Flex-OneNAND simulator partition" -#else -#define PARTITION_NAME "OneNAND simulator partition" -#endif - -static struct mtd_partition os_partitions[] = { - { - .name = PARTITION_NAME, - .offset = 0, - .size = MTDPART_SIZ_FULL, - }, -}; - -/* - * OneNAND simulator mtd - */ -struct onenand_info { - struct mtd_info mtd; - struct mtd_partition *parts; - struct onenand_chip onenand; - struct onenand_flash flash; -}; - -static struct onenand_info *info; - -#define DPRINTK(format, args...) \ -do { \ - printk(KERN_DEBUG "%s[%d]: " format "\n", __func__, \ - __LINE__, ##args); \ -} while (0) - -/** - * onenand_lock_handle - Handle Lock scheme - * @this: OneNAND device structure - * @cmd: The command to be sent - * - * Send lock command to OneNAND device. - * The lock scheme depends on chip type. - */ -static void onenand_lock_handle(struct onenand_chip *this, int cmd) -{ - int block_lock_scheme; - int status; - - status = ONENAND_GET_WP_STATUS(this); - block_lock_scheme = !(this->options & ONENAND_HAS_CONT_LOCK); - - switch (cmd) { - case ONENAND_CMD_UNLOCK: - case ONENAND_CMD_UNLOCK_ALL: - if (block_lock_scheme) - ONENAND_SET_WP_STATUS(ONENAND_WP_US, this); - else - ONENAND_SET_WP_STATUS(status | ONENAND_WP_US, this); - break; - - case ONENAND_CMD_LOCK: - if (block_lock_scheme) - ONENAND_SET_WP_STATUS(ONENAND_WP_LS, this); - else - ONENAND_SET_WP_STATUS(status | ONENAND_WP_LS, this); - break; - - case ONENAND_CMD_LOCK_TIGHT: - if (block_lock_scheme) - ONENAND_SET_WP_STATUS(ONENAND_WP_LTS, this); - else - ONENAND_SET_WP_STATUS(status | ONENAND_WP_LTS, this); - break; - - default: - break; - } -} - -/** - * onenand_bootram_handle - Handle BootRAM area - * @this: OneNAND device structure - * @cmd: The command to be sent - * - * Emulate BootRAM area. It is possible to do basic operation using BootRAM. - */ -static void onenand_bootram_handle(struct onenand_chip *this, int cmd) -{ - switch (cmd) { - case ONENAND_CMD_READID: - writew(manuf_id, this->base); - writew(device_id, this->base + 2); - writew(version_id, this->base + 4); - break; - - default: - /* REVIST: Handle other commands */ - break; - } -} - -/** - * onenand_update_interrupt - Set interrupt register - * @this: OneNAND device structure - * @cmd: The command to be sent - * - * Update interrupt register. The status depends on command. - */ -static void onenand_update_interrupt(struct onenand_chip *this, int cmd) -{ - int interrupt = ONENAND_INT_MASTER; - - switch (cmd) { - case ONENAND_CMD_READ: - case ONENAND_CMD_READOOB: - interrupt |= ONENAND_INT_READ; - break; - - case ONENAND_CMD_PROG: - case ONENAND_CMD_PROGOOB: - interrupt |= ONENAND_INT_WRITE; - break; - - case ONENAND_CMD_ERASE: - interrupt |= ONENAND_INT_ERASE; - break; - - case ONENAND_CMD_RESET: - interrupt |= ONENAND_INT_RESET; - break; - - default: - break; - } - - writew(interrupt, this->base + ONENAND_REG_INTERRUPT); -} - -/** - * onenand_check_overwrite - Check if over-write happened - * @dest: The destination pointer - * @src: The source pointer - * @count: The length to be check - * - * Returns: 0 on same, otherwise 1 - * - * Compare the source with destination - */ -static int onenand_check_overwrite(void *dest, void *src, size_t count) -{ - unsigned int *s = (unsigned int *) src; - unsigned int *d = (unsigned int *) dest; - int i; - - count >>= 2; - for (i = 0; i < count; i++) - if ((*s++ ^ *d++) != 0) - return 1; - - return 0; -} - -/** - * onenand_data_handle - Handle OneNAND Core and DataRAM - * @this: OneNAND device structure - * @cmd: The command to be sent - * @dataram: Which dataram used - * @offset: The offset to OneNAND Core - * - * Copy data from OneNAND Core to DataRAM (read) - * Copy data from DataRAM to OneNAND Core (write) - * Erase the OneNAND Core (erase) - */ -static void onenand_data_handle(struct onenand_chip *this, int cmd, - int dataram, unsigned int offset) -{ - struct mtd_info *mtd = &info->mtd; - struct onenand_flash *flash = this->priv; - int main_offset, spare_offset, die = 0; - void __iomem *src; - void __iomem *dest; - unsigned int i; - static int pi_operation; - int erasesize, rgn; - - if (dataram) { - main_offset = mtd->writesize; - spare_offset = mtd->oobsize; - } else { - main_offset = 0; - spare_offset = 0; - } - - if (pi_operation) { - die = readw(this->base + ONENAND_REG_START_ADDRESS2); - die >>= ONENAND_DDP_SHIFT; - } - - switch (cmd) { - case FLEXONENAND_CMD_PI_ACCESS: - pi_operation = 1; - break; - - case ONENAND_CMD_RESET: - pi_operation = 0; - break; - - case ONENAND_CMD_READ: - src = ONENAND_CORE(flash) + offset; - dest = ONENAND_MAIN_AREA(this, main_offset); - if (pi_operation) { - writew(boundary[die], this->base + ONENAND_DATARAM); - break; - } - memcpy(dest, src, mtd->writesize); - /* Fall through */ - - case ONENAND_CMD_READOOB: - src = ONENAND_CORE_SPARE(flash, this, offset); - dest = ONENAND_SPARE_AREA(this, spare_offset); - memcpy(dest, src, mtd->oobsize); - break; - - case ONENAND_CMD_PROG: - src = ONENAND_MAIN_AREA(this, main_offset); - dest = ONENAND_CORE(flash) + offset; - if (pi_operation) { - boundary[die] = readw(this->base + ONENAND_DATARAM); - break; - } - /* To handle partial write */ - for (i = 0; i < (1 << mtd->subpage_sft); i++) { - int off = i * this->subpagesize; - if (!memcmp(src + off, ffchars, this->subpagesize)) - continue; - if (memcmp(dest + off, ffchars, this->subpagesize) && - onenand_check_overwrite(dest + off, src + off, this->subpagesize)) - printk(KERN_ERR "over-write happened at 0x%08x\n", offset); - memcpy(dest + off, src + off, this->subpagesize); - } - /* Fall through */ - - case ONENAND_CMD_PROGOOB: - src = ONENAND_SPARE_AREA(this, spare_offset); - /* Check all data is 0xff chars */ - if (!memcmp(src, ffchars, mtd->oobsize)) - break; - - dest = ONENAND_CORE_SPARE(flash, this, offset); - if (memcmp(dest, ffchars, mtd->oobsize) && - onenand_check_overwrite(dest, src, mtd->oobsize)) - printk(KERN_ERR "OOB: over-write happened at 0x%08x\n", - offset); - memcpy(dest, src, mtd->oobsize); - break; - - case ONENAND_CMD_ERASE: - if (pi_operation) - break; - - if (FLEXONENAND(this)) { - rgn = flexonenand_region(mtd, offset); - erasesize = mtd->eraseregions[rgn].erasesize; - } else - erasesize = mtd->erasesize; - - memset(ONENAND_CORE(flash) + offset, 0xff, erasesize); - memset(ONENAND_CORE_SPARE(flash, this, offset), 0xff, - (erasesize >> 5)); - break; - - default: - break; - } -} - -/** - * onenand_command_handle - Handle command - * @this: OneNAND device structure - * @cmd: The command to be sent - * - * Emulate OneNAND command. - */ -static void onenand_command_handle(struct onenand_chip *this, int cmd) -{ - unsigned long offset = 0; - int block = -1, page = -1, bufferram = -1; - int dataram = 0; - - switch (cmd) { - case ONENAND_CMD_UNLOCK: - case ONENAND_CMD_LOCK: - case ONENAND_CMD_LOCK_TIGHT: - case ONENAND_CMD_UNLOCK_ALL: - onenand_lock_handle(this, cmd); - break; - - case ONENAND_CMD_BUFFERRAM: - /* Do nothing */ - return; - - default: - block = (int) readw(this->base + ONENAND_REG_START_ADDRESS1); - if (block & (1 << ONENAND_DDP_SHIFT)) { - block &= ~(1 << ONENAND_DDP_SHIFT); - /* The half of chip block */ - block += this->chipsize >> (this->erase_shift + 1); - } - if (cmd == ONENAND_CMD_ERASE) - break; - - page = (int) readw(this->base + ONENAND_REG_START_ADDRESS8); - page = (page >> ONENAND_FPA_SHIFT); - bufferram = (int) readw(this->base + ONENAND_REG_START_BUFFER); - bufferram >>= ONENAND_BSA_SHIFT; - bufferram &= ONENAND_BSA_DATARAM1; - dataram = (bufferram == ONENAND_BSA_DATARAM1) ? 1 : 0; - break; - } - - if (block != -1) - offset = onenand_addr(this, block); - - if (page != -1) - offset += page << this->page_shift; - - onenand_data_handle(this, cmd, dataram, offset); - - onenand_update_interrupt(this, cmd); -} - -/** - * onenand_writew - [OneNAND Interface] Emulate write operation - * @value: value to write - * @addr: address to write - * - * Write OneNAND register with value - */ -static void onenand_writew(unsigned short value, void __iomem * addr) -{ - struct onenand_chip *this = info->mtd.priv; - - /* BootRAM handling */ - if (addr < this->base + ONENAND_DATARAM) { - onenand_bootram_handle(this, value); - return; - } - /* Command handling */ - if (addr == this->base + ONENAND_REG_COMMAND) - onenand_command_handle(this, value); - - writew(value, addr); -} - -/** - * flash_init - Initialize OneNAND simulator - * @flash: OneNAND simulator data strucutres - * - * Initialize OneNAND simulator. - */ -static int __init flash_init(struct onenand_flash *flash) -{ - int density, size; - int buffer_size; - - flash->base = kzalloc(131072, GFP_KERNEL); - if (!flash->base) { - printk(KERN_ERR "Unable to allocate base address.\n"); - return -ENOMEM; - } - - density = device_id >> ONENAND_DEVICE_DENSITY_SHIFT; - density &= ONENAND_DEVICE_DENSITY_MASK; - size = ((16 << 20) << density); - - ONENAND_CORE(flash) = vmalloc(size + (size >> 5)); - if (!ONENAND_CORE(flash)) { - printk(KERN_ERR "Unable to allocate nand core address.\n"); - kfree(flash->base); - return -ENOMEM; - } - - memset(ONENAND_CORE(flash), 0xff, size + (size >> 5)); - - /* Setup registers */ - writew(manuf_id, flash->base + ONENAND_REG_MANUFACTURER_ID); - writew(device_id, flash->base + ONENAND_REG_DEVICE_ID); - writew(version_id, flash->base + ONENAND_REG_VERSION_ID); - writew(technology_id, flash->base + ONENAND_REG_TECHNOLOGY); - - if (density < 2 && (!CONFIG_FLEXONENAND)) - buffer_size = 0x0400; /* 1KiB page */ - else - buffer_size = 0x0800; /* 2KiB page */ - writew(buffer_size, flash->base + ONENAND_REG_DATA_BUFFER_SIZE); - - return 0; -} - -/** - * flash_exit - Clean up OneNAND simulator - * @flash: OneNAND simulator data structures - * - * Clean up OneNAND simulator. - */ -static void flash_exit(struct onenand_flash *flash) -{ - vfree(ONENAND_CORE(flash)); - kfree(flash->base); -} - -static int __init onenand_sim_init(void) -{ - /* Allocate all 0xff chars pointer */ - ffchars = kmalloc(MAX_ONENAND_PAGESIZE, GFP_KERNEL); - if (!ffchars) { - printk(KERN_ERR "Unable to allocate ff chars.\n"); - return -ENOMEM; - } - memset(ffchars, 0xff, MAX_ONENAND_PAGESIZE); - - /* Allocate OneNAND simulator mtd pointer */ - info = kzalloc(sizeof(struct onenand_info), GFP_KERNEL); - if (!info) { - printk(KERN_ERR "Unable to allocate core structures.\n"); - kfree(ffchars); - return -ENOMEM; - } - - /* Override write_word function */ - info->onenand.write_word = onenand_writew; - - if (flash_init(&info->flash)) { - printk(KERN_ERR "Unable to allocate flash.\n"); - kfree(ffchars); - kfree(info); - return -ENOMEM; - } - - info->parts = os_partitions; - - info->onenand.base = info->flash.base; - info->onenand.priv = &info->flash; - - info->mtd.name = "OneNAND simulator"; - info->mtd.priv = &info->onenand; - info->mtd.owner = THIS_MODULE; - - if (onenand_scan(&info->mtd, 1)) { - flash_exit(&info->flash); - kfree(ffchars); - kfree(info); - return -ENXIO; - } - - mtd_device_register(&info->mtd, info->parts, - ARRAY_SIZE(os_partitions)); - - return 0; -} - -static void __exit onenand_sim_exit(void) -{ - struct onenand_chip *this = info->mtd.priv; - struct onenand_flash *flash = this->priv; - - onenand_release(&info->mtd); - flash_exit(flash); - kfree(ffchars); - kfree(info); -} - -module_init(onenand_sim_init); -module_exit(onenand_sim_exit); - -MODULE_AUTHOR("Kyungmin Park "); -MODULE_DESCRIPTION("The OneNAND flash simulator"); -MODULE_LICENSE("GPL"); diff --git a/drivers/net/ethernet/adi/bfin_mac.c b/drivers/net/ethernet/adi/bfin_mac.c index ee705771bd2c..dada66bfe0d6 100644 --- a/drivers/net/ethernet/adi/bfin_mac.c +++ b/drivers/net/ethernet/adi/bfin_mac.c @@ -1700,7 +1700,8 @@ static int bfin_mac_probe(struct platform_device *pdev) } bfin_mac_hwtstamp_init(ndev); - if (bfin_phc_init(ndev, &pdev->dev)) { + rc = bfin_phc_init(ndev, &pdev->dev); + if (rc) { dev_err(&pdev->dev, "Cannot register PHC device!\n"); goto out_err_phc; } diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c index e1e5bb9d9054..fd7b547698ab 100644 --- a/drivers/net/ethernet/emulex/benet/be_cmds.c +++ b/drivers/net/ethernet/emulex/benet/be_cmds.c @@ -2640,9 +2640,8 @@ int be_cmd_get_mac_from_list(struct be_adapter *adapter, u8 *mac, req = get_mac_list_cmd.va; be_wrb_cmd_hdr_prepare(&req->hdr, CMD_SUBSYSTEM_COMMON, - OPCODE_COMMON_GET_MAC_LIST, sizeof(*req), - wrb, &get_mac_list_cmd); - + OPCODE_COMMON_GET_MAC_LIST, + get_mac_list_cmd.size, wrb, &get_mac_list_cmd); req->hdr.domain = domain; req->mac_type = MAC_ADDRESS_TYPE_NETWORK; req->perm_override = 1; diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 6c52a60dcdb7..a444110b060f 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1827,7 +1827,7 @@ static void be_rx_cq_clean(struct be_rx_obj *rxo) mdelay(1); } else { be_rx_compl_discard(rxo, rxcp); - be_cq_notify(adapter, rx_cq->id, true, 1); + be_cq_notify(adapter, rx_cq->id, false, 1); if (rxcp->num_rcvd == 0) break; } @@ -2533,11 +2533,6 @@ static void be_rx_qs_destroy(struct be_adapter *adapter) q = &rxo->q; if (q->created) { be_cmd_rxq_destroy(adapter, q); - /* After the rxq is invalidated, wait for a grace time - * of 1ms for all dma to end and the flush compl to - * arrive - */ - mdelay(1); be_rx_cq_clean(rxo); } be_queue_free(adapter, q); @@ -2564,6 +2559,7 @@ static int be_close(struct net_device *netdev) * all tx skbs are freed. */ be_tx_compl_clean(adapter); + netif_tx_disable(netdev); be_rx_qs_destroy(adapter); @@ -2672,6 +2668,7 @@ static int be_open(struct net_device *netdev) if (!status) be_link_status_update(adapter, link_status); + netif_tx_start_all_queues(netdev); be_roce_dev_open(adapter); return 0; err: @@ -2783,6 +2780,8 @@ static void be_vf_clear(struct be_adapter *adapter) goto done; } + pci_disable_sriov(adapter->pdev); + for_all_vfs(adapter, vf_cfg, vf) { if (lancer_chip(adapter)) be_cmd_set_mac_list(adapter, NULL, 0, vf + 1); @@ -2792,7 +2791,6 @@ static void be_vf_clear(struct be_adapter *adapter) be_cmd_if_destroy(adapter, vf_cfg->if_handle, vf + 1); } - pci_disable_sriov(adapter->pdev); done: kfree(adapter->vf_cfg); adapter->num_vfs = 0; @@ -2889,13 +2887,8 @@ static int be_vf_setup(struct be_adapter *adapter) dev_info(dev, "Device supports %d VFs and not %d\n", adapter->dev_num_vfs, num_vfs); adapter->num_vfs = min_t(u16, num_vfs, adapter->dev_num_vfs); - - status = pci_enable_sriov(adapter->pdev, num_vfs); - if (status) { - dev_err(dev, "SRIOV enable failed\n"); - adapter->num_vfs = 0; + if (!adapter->num_vfs) return 0; - } } status = be_vf_setup_init(adapter); @@ -2944,6 +2937,15 @@ static int be_vf_setup(struct be_adapter *adapter) be_cmd_enable_vf(adapter, vf + 1); } + + if (!old_vfs) { + status = pci_enable_sriov(adapter->pdev, adapter->num_vfs); + if (status) { + dev_err(dev, "SRIOV enable failed\n"); + adapter->num_vfs = 0; + goto err; + } + } return 0; err: dev_err(dev, "VF setup failed\n"); @@ -3198,7 +3200,7 @@ static int be_setup(struct be_adapter *adapter) be_cmd_set_flow_control(adapter, adapter->tx_fc, adapter->rx_fc); - if (be_physfn(adapter) && num_vfs) { + if (be_physfn(adapter)) { if (adapter->dev_num_vfs) be_vf_setup(adapter); else diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index ceb4d43c132d..9ce5b7185fda 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -198,6 +198,11 @@ struct bufdesc_ex { #define FLAG_RX_CSUM_ENABLED (BD_ENET_RX_ICE | BD_ENET_RX_PCR) #define FLAG_RX_CSUM_ERROR (BD_ENET_RX_ICE | BD_ENET_RX_PCR) +struct fec_enet_delayed_work { + struct delayed_work delay_work; + bool timeout; +}; + /* The FEC buffer descriptors track the ring buffers. The rx_bd_base and * tx_bd_base always point to the base of the buffer descriptors. The * cur_rx and cur_tx point to the currently available buffer. @@ -232,9 +237,6 @@ struct fec_enet_private { /* The ring entries to be free()ed */ struct bufdesc *dirty_tx; - /* hold while accessing the HW like ringbuffer for tx/rx but not MAC */ - spinlock_t hw_lock; - struct platform_device *pdev; int opened; @@ -269,7 +271,7 @@ struct fec_enet_private { int hwts_rx_en; int hwts_tx_en; struct timer_list time_keep; - + struct fec_enet_delayed_work delay_work; }; void fec_ptp_init(struct net_device *ndev, struct platform_device *pdev); diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index e25bf832e6b3..aff0310a778b 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -445,6 +445,13 @@ fec_restart(struct net_device *ndev, int duplex) u32 rcntl = OPT_FRAME_SIZE | 0x04; u32 ecntl = 0x2; /* ETHEREN */ + if (netif_running(ndev)) { + netif_device_detach(ndev); + napi_disable(&fep->napi); + netif_stop_queue(ndev); + netif_tx_lock(ndev); + } + /* Whack a reset. We should wait for this. */ writel(1, fep->hwp + FEC_ECNTRL); udelay(10); @@ -605,6 +612,13 @@ fec_restart(struct net_device *ndev, int duplex) /* Enable interrupts we wish to service */ writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK); + + if (netif_running(ndev)) { + netif_device_attach(ndev); + napi_enable(&fep->napi); + netif_wake_queue(ndev); + netif_tx_unlock(ndev); + } } static void @@ -644,8 +658,22 @@ fec_timeout(struct net_device *ndev) ndev->stats.tx_errors++; - fec_restart(ndev, fep->full_duplex); - netif_wake_queue(ndev); + fep->delay_work.timeout = true; + schedule_delayed_work(&(fep->delay_work.delay_work), 0); +} + +static void fec_enet_work(struct work_struct *work) +{ + struct fec_enet_private *fep = + container_of(work, + struct fec_enet_private, + delay_work.delay_work.work); + + if (fep->delay_work.timeout) { + fep->delay_work.timeout = false; + fec_restart(fep->netdev, fep->full_duplex); + netif_wake_queue(fep->netdev); + } } static void @@ -1024,16 +1052,12 @@ static void fec_enet_adjust_link(struct net_device *ndev) { struct fec_enet_private *fep = netdev_priv(ndev); struct phy_device *phy_dev = fep->phy_dev; - unsigned long flags; - int status_change = 0; - spin_lock_irqsave(&fep->hw_lock, flags); - /* Prevent a state halted on mii error */ if (fep->mii_timeout && phy_dev->state == PHY_HALTED) { phy_dev->state = PHY_RESUMING; - goto spin_unlock; + return; } if (phy_dev->link) { @@ -1061,9 +1085,6 @@ static void fec_enet_adjust_link(struct net_device *ndev) } } -spin_unlock: - spin_unlock_irqrestore(&fep->hw_lock, flags); - if (status_change) phy_print_status(phy_dev); } @@ -1732,7 +1753,6 @@ static int fec_enet_init(struct net_device *ndev) return -ENOMEM; memset(cbd_base, 0, PAGE_SIZE); - spin_lock_init(&fep->hw_lock); fep->netdev = ndev; @@ -1952,6 +1972,7 @@ fec_probe(struct platform_device *pdev) if (fep->bufdesc_ex && fep->ptp_clock) netdev_info(ndev, "registered PHC device %d\n", fep->dev_id); + INIT_DELAYED_WORK(&(fep->delay_work.delay_work), fec_enet_work); return 0; failed_register: @@ -1984,6 +2005,7 @@ fec_drv_remove(struct platform_device *pdev) struct fec_enet_private *fep = netdev_priv(ndev); int i; + cancel_delayed_work_sync(&(fep->delay_work.delay_work)); unregister_netdev(ndev); fec_enet_mii_remove(fep); del_timer_sync(&fep->time_keep); diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c index bcf4d118e98c..c9e6b62dd000 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c @@ -889,7 +889,7 @@ static int mlx4_en_flow_replace(struct net_device *dev, .queue_mode = MLX4_NET_TRANS_Q_FIFO, .exclusive = 0, .allow_loopback = 1, - .promisc_mode = MLX4_FS_PROMISC_NONE, + .promisc_mode = MLX4_FS_REGULAR, }; rule.port = priv->port; diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index a69a908614e6..b35f94700093 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -127,7 +127,7 @@ static void mlx4_en_filter_work(struct work_struct *work) .queue_mode = MLX4_NET_TRANS_Q_LIFO, .exclusive = 1, .allow_loopback = 1, - .promisc_mode = MLX4_FS_PROMISC_NONE, + .promisc_mode = MLX4_FS_REGULAR, .port = priv->port, .priority = MLX4_DOMAIN_RFS, }; @@ -448,7 +448,7 @@ static int mlx4_en_uc_steer_add(struct mlx4_en_priv *priv, .queue_mode = MLX4_NET_TRANS_Q_FIFO, .exclusive = 0, .allow_loopback = 1, - .promisc_mode = MLX4_FS_PROMISC_NONE, + .promisc_mode = MLX4_FS_REGULAR, .priority = MLX4_DOMAIN_NIC, }; @@ -795,7 +795,7 @@ static void mlx4_en_set_promisc_mode(struct mlx4_en_priv *priv, err = mlx4_flow_steer_promisc_add(mdev->dev, priv->port, priv->base_qpn, - MLX4_FS_PROMISC_UPLINK); + MLX4_FS_ALL_DEFAULT); if (err) en_err(priv, "Failed enabling promiscuous mode\n"); priv->flags |= MLX4_EN_FLAG_MC_PROMISC; @@ -858,7 +858,7 @@ static void mlx4_en_clear_promisc_mode(struct mlx4_en_priv *priv, case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, - MLX4_FS_PROMISC_UPLINK); + MLX4_FS_ALL_DEFAULT); if (err) en_err(priv, "Failed disabling promiscuous mode\n"); priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC; @@ -919,7 +919,7 @@ static void mlx4_en_do_multicast(struct mlx4_en_priv *priv, err = mlx4_flow_steer_promisc_add(mdev->dev, priv->port, priv->base_qpn, - MLX4_FS_PROMISC_ALL_MULTI); + MLX4_FS_MC_DEFAULT); break; case MLX4_STEERING_MODE_B0: @@ -942,7 +942,7 @@ static void mlx4_en_do_multicast(struct mlx4_en_priv *priv, case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, - MLX4_FS_PROMISC_ALL_MULTI); + MLX4_FS_MC_DEFAULT); break; case MLX4_STEERING_MODE_B0: @@ -1621,10 +1621,10 @@ void mlx4_en_stop_port(struct net_device *dev, int detach) MLX4_EN_FLAG_MC_PROMISC); mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, - MLX4_FS_PROMISC_UPLINK); + MLX4_FS_ALL_DEFAULT); mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, - MLX4_FS_PROMISC_ALL_MULTI); + MLX4_FS_MC_DEFAULT); } else if (priv->flags & MLX4_EN_FLAG_PROMISC) { priv->flags &= ~MLX4_EN_FLAG_PROMISC; diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c index 8e3123a1df88..6000342f9725 100644 --- a/drivers/net/ethernet/mellanox/mlx4/eq.c +++ b/drivers/net/ethernet/mellanox/mlx4/eq.c @@ -497,8 +497,8 @@ static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq) break; case MLX4_EVENT_TYPE_SRQ_LIMIT: - mlx4_warn(dev, "%s: MLX4_EVENT_TYPE_SRQ_LIMIT\n", - __func__); + mlx4_dbg(dev, "%s: MLX4_EVENT_TYPE_SRQ_LIMIT\n", + __func__); case MLX4_EVENT_TYPE_SRQ_CATAS_ERROR: if (mlx4_is_master(dev)) { /* forward only to slave owning the SRQ */ diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c b/drivers/net/ethernet/mellanox/mlx4/mcg.c index ffc78d2cb0cf..f3e804f2a35f 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mcg.c +++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c @@ -645,25 +645,37 @@ static int find_entry(struct mlx4_dev *dev, u8 port, return err; } +static const u8 __promisc_mode[] = { + [MLX4_FS_REGULAR] = 0x0, + [MLX4_FS_ALL_DEFAULT] = 0x1, + [MLX4_FS_MC_DEFAULT] = 0x3, + [MLX4_FS_UC_SNIFFER] = 0x4, + [MLX4_FS_MC_SNIFFER] = 0x5, +}; + +int mlx4_map_sw_to_hw_steering_mode(struct mlx4_dev *dev, + enum mlx4_net_trans_promisc_mode flow_type) +{ + if (flow_type >= MLX4_FS_MODE_NUM || flow_type < 0) { + mlx4_err(dev, "Invalid flow type. type = %d\n", flow_type); + return -EINVAL; + } + return __promisc_mode[flow_type]; +} +EXPORT_SYMBOL_GPL(mlx4_map_sw_to_hw_steering_mode); + static void trans_rule_ctrl_to_hw(struct mlx4_net_trans_rule *ctrl, struct mlx4_net_trans_rule_hw_ctrl *hw) { - static const u8 __promisc_mode[] = { - [MLX4_FS_PROMISC_NONE] = 0x0, - [MLX4_FS_PROMISC_UPLINK] = 0x1, - [MLX4_FS_PROMISC_FUNCTION_PORT] = 0x2, - [MLX4_FS_PROMISC_ALL_MULTI] = 0x3, - }; - - u32 dw = 0; - - dw = ctrl->queue_mode == MLX4_NET_TRANS_Q_LIFO ? 1 : 0; - dw |= ctrl->exclusive ? (1 << 2) : 0; - dw |= ctrl->allow_loopback ? (1 << 3) : 0; - dw |= __promisc_mode[ctrl->promisc_mode] << 8; - dw |= ctrl->priority << 16; - - hw->ctrl = cpu_to_be32(dw); + u8 flags = 0; + + flags = ctrl->queue_mode == MLX4_NET_TRANS_Q_LIFO ? 1 : 0; + flags |= ctrl->exclusive ? (1 << 2) : 0; + flags |= ctrl->allow_loopback ? (1 << 3) : 0; + + hw->flags = flags; + hw->type = __promisc_mode[ctrl->promisc_mode]; + hw->prio = cpu_to_be16(ctrl->priority); hw->port = ctrl->port; hw->qpn = cpu_to_be32(ctrl->qpn); } @@ -677,29 +689,51 @@ const u16 __sw_id_hw[] = { [MLX4_NET_TRANS_RULE_ID_UDP] = 0xE006 }; +int mlx4_map_sw_to_hw_steering_id(struct mlx4_dev *dev, + enum mlx4_net_trans_rule_id id) +{ + if (id >= MLX4_NET_TRANS_RULE_NUM || id < 0) { + mlx4_err(dev, "Invalid network rule id. id = %d\n", id); + return -EINVAL; + } + return __sw_id_hw[id]; +} +EXPORT_SYMBOL_GPL(mlx4_map_sw_to_hw_steering_id); + +static const int __rule_hw_sz[] = { + [MLX4_NET_TRANS_RULE_ID_ETH] = + sizeof(struct mlx4_net_trans_rule_hw_eth), + [MLX4_NET_TRANS_RULE_ID_IB] = + sizeof(struct mlx4_net_trans_rule_hw_ib), + [MLX4_NET_TRANS_RULE_ID_IPV6] = 0, + [MLX4_NET_TRANS_RULE_ID_IPV4] = + sizeof(struct mlx4_net_trans_rule_hw_ipv4), + [MLX4_NET_TRANS_RULE_ID_TCP] = + sizeof(struct mlx4_net_trans_rule_hw_tcp_udp), + [MLX4_NET_TRANS_RULE_ID_UDP] = + sizeof(struct mlx4_net_trans_rule_hw_tcp_udp) +}; + +int mlx4_hw_rule_sz(struct mlx4_dev *dev, + enum mlx4_net_trans_rule_id id) +{ + if (id >= MLX4_NET_TRANS_RULE_NUM || id < 0) { + mlx4_err(dev, "Invalid network rule id. id = %d\n", id); + return -EINVAL; + } + + return __rule_hw_sz[id]; +} +EXPORT_SYMBOL_GPL(mlx4_hw_rule_sz); + static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec, struct _rule_hw *rule_hw) { - static const size_t __rule_hw_sz[] = { - [MLX4_NET_TRANS_RULE_ID_ETH] = - sizeof(struct mlx4_net_trans_rule_hw_eth), - [MLX4_NET_TRANS_RULE_ID_IB] = - sizeof(struct mlx4_net_trans_rule_hw_ib), - [MLX4_NET_TRANS_RULE_ID_IPV6] = 0, - [MLX4_NET_TRANS_RULE_ID_IPV4] = - sizeof(struct mlx4_net_trans_rule_hw_ipv4), - [MLX4_NET_TRANS_RULE_ID_TCP] = - sizeof(struct mlx4_net_trans_rule_hw_tcp_udp), - [MLX4_NET_TRANS_RULE_ID_UDP] = - sizeof(struct mlx4_net_trans_rule_hw_tcp_udp) - }; - if (spec->id >= MLX4_NET_TRANS_RULE_NUM) { - mlx4_err(dev, "Invalid network rule id. id = %d\n", spec->id); + if (mlx4_hw_rule_sz(dev, spec->id) < 0) return -EINVAL; - } - memset(rule_hw, 0, __rule_hw_sz[spec->id]); + memset(rule_hw, 0, mlx4_hw_rule_sz(dev, spec->id)); rule_hw->id = cpu_to_be16(__sw_id_hw[spec->id]); - rule_hw->size = __rule_hw_sz[spec->id] >> 2; + rule_hw->size = mlx4_hw_rule_sz(dev, spec->id) >> 2; switch (spec->id) { case MLX4_NET_TRANS_RULE_ID_ETH: @@ -713,12 +747,12 @@ static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec, rule_hw->eth.ether_type_enable = 1; rule_hw->eth.ether_type = spec->eth.ether_type; } - rule_hw->eth.vlan_id = spec->eth.vlan_id; - rule_hw->eth.vlan_id_msk = spec->eth.vlan_id_msk; + rule_hw->eth.vlan_tag = spec->eth.vlan_id; + rule_hw->eth.vlan_tag_msk = spec->eth.vlan_id_msk; break; case MLX4_NET_TRANS_RULE_ID_IB: - rule_hw->ib.qpn = spec->ib.r_qpn; + rule_hw->ib.l3_qpn = spec->ib.l3_qpn; rule_hw->ib.qpn_mask = spec->ib.qpn_msk; memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16); memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16); @@ -1136,7 +1170,7 @@ int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, struct mlx4_net_trans_rule rule = { .queue_mode = MLX4_NET_TRANS_Q_FIFO, .exclusive = 0, - .promisc_mode = MLX4_FS_PROMISC_NONE, + .promisc_mode = MLX4_FS_REGULAR, .priority = MLX4_DOMAIN_NIC, }; @@ -1229,11 +1263,10 @@ int mlx4_flow_steer_promisc_add(struct mlx4_dev *dev, u8 port, u64 *regid_p; switch (mode) { - case MLX4_FS_PROMISC_UPLINK: - case MLX4_FS_PROMISC_FUNCTION_PORT: + case MLX4_FS_ALL_DEFAULT: regid_p = &dev->regid_promisc_array[port]; break; - case MLX4_FS_PROMISC_ALL_MULTI: + case MLX4_FS_MC_DEFAULT: regid_p = &dev->regid_allmulti_array[port]; break; default: @@ -1260,11 +1293,10 @@ int mlx4_flow_steer_promisc_remove(struct mlx4_dev *dev, u8 port, u64 *regid_p; switch (mode) { - case MLX4_FS_PROMISC_UPLINK: - case MLX4_FS_PROMISC_FUNCTION_PORT: + case MLX4_FS_ALL_DEFAULT: regid_p = &dev->regid_promisc_array[port]; break; - case MLX4_FS_PROMISC_ALL_MULTI: + case MLX4_FS_MC_DEFAULT: regid_p = &dev->regid_allmulti_array[port]; break; default: diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h index eac3dae10efe..df15bb6631cc 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h @@ -730,85 +730,6 @@ struct mlx4_steer { struct list_head steer_entries[MLX4_NUM_STEERS]; }; -struct mlx4_net_trans_rule_hw_ctrl { - __be32 ctrl; - u8 rsvd1; - u8 funcid; - u8 vep; - u8 port; - __be32 qpn; - __be32 rsvd2; -}; - -struct mlx4_net_trans_rule_hw_ib { - u8 size; - u8 rsvd1; - __be16 id; - u32 rsvd2; - __be32 qpn; - __be32 qpn_mask; - u8 dst_gid[16]; - u8 dst_gid_msk[16]; -} __packed; - -struct mlx4_net_trans_rule_hw_eth { - u8 size; - u8 rsvd; - __be16 id; - u8 rsvd1[6]; - u8 dst_mac[6]; - u16 rsvd2; - u8 dst_mac_msk[6]; - u16 rsvd3; - u8 src_mac[6]; - u16 rsvd4; - u8 src_mac_msk[6]; - u8 rsvd5; - u8 ether_type_enable; - __be16 ether_type; - __be16 vlan_id_msk; - __be16 vlan_id; -} __packed; - -struct mlx4_net_trans_rule_hw_tcp_udp { - u8 size; - u8 rsvd; - __be16 id; - __be16 rsvd1[3]; - __be16 dst_port; - __be16 rsvd2; - __be16 dst_port_msk; - __be16 rsvd3; - __be16 src_port; - __be16 rsvd4; - __be16 src_port_msk; -} __packed; - -struct mlx4_net_trans_rule_hw_ipv4 { - u8 size; - u8 rsvd; - __be16 id; - __be32 rsvd1; - __be32 dst_ip; - __be32 dst_ip_msk; - __be32 src_ip; - __be32 src_ip_msk; -} __packed; - -struct _rule_hw { - union { - struct { - u8 size; - u8 rsvd; - __be16 id; - }; - struct mlx4_net_trans_rule_hw_eth eth; - struct mlx4_net_trans_rule_hw_ib ib; - struct mlx4_net_trans_rule_hw_ipv4 ipv4; - struct mlx4_net_trans_rule_hw_tcp_udp tcp_udp; - }; -}; - enum { MLX4_PCI_DEV_IS_VF = 1 << 0, MLX4_PCI_DEV_FORCE_SENSE_PORT = 1 << 1, diff --git a/drivers/net/ethernet/mellanox/mlx4/srq.c b/drivers/net/ethernet/mellanox/mlx4/srq.c index e329fe1f11b7..79fd269e2c54 100644 --- a/drivers/net/ethernet/mellanox/mlx4/srq.c +++ b/drivers/net/ethernet/mellanox/mlx4/srq.c @@ -298,3 +298,18 @@ void mlx4_cleanup_srq_table(struct mlx4_dev *dev) return; mlx4_bitmap_cleanup(&mlx4_priv(dev)->srq_table.bitmap); } + +struct mlx4_srq *mlx4_srq_lookup(struct mlx4_dev *dev, u32 srqn) +{ + struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; + struct mlx4_srq *srq; + unsigned long flags; + + spin_lock_irqsave(&srq_table->lock, flags); + srq = radix_tree_lookup(&srq_table->tree, + srqn & (dev->caps.num_srqs - 1)); + spin_unlock_irqrestore(&srq_table->lock, flags); + + return srq; +} +EXPORT_SYMBOL_GPL(mlx4_srq_lookup); diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c index 07f6baa15c0c..9a95abf2dedf 100644 --- a/drivers/net/ethernet/sfc/ptp.c +++ b/drivers/net/ethernet/sfc/ptp.c @@ -912,8 +912,10 @@ static int efx_ptp_probe_channel(struct efx_channel *channel) ptp->phc_clock = ptp_clock_register(&ptp->phc_clock_info, &efx->pci_dev->dev); - if (!ptp->phc_clock) + if (IS_ERR(ptp->phc_clock)) { + rc = PTR_ERR(ptp->phc_clock); goto fail3; + } INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker); ptp->pps_workwq = create_singlethread_workqueue("sfc_pps"); diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c index 66e025ad5df1..f3c2d034b32c 100644 --- a/drivers/net/ethernet/tile/tilegx.c +++ b/drivers/net/ethernet/tile/tilegx.c @@ -930,7 +930,7 @@ static int tile_net_setup_interrupts(struct net_device *dev) if (info->has_iqueue) { gxio_mpipe_request_notif_ring_interrupt( &context, cpu_x(cpu), cpu_y(cpu), - 1, ingress_irq, info->iqueue.ring); + KERNEL_PL, ingress_irq, info->iqueue.ring); } } diff --git a/drivers/net/ethernet/toshiba/spider_net.c b/drivers/net/ethernet/toshiba/spider_net.c index c655fe60121e..5734480c1ecf 100644 --- a/drivers/net/ethernet/toshiba/spider_net.c +++ b/drivers/net/ethernet/toshiba/spider_net.c @@ -1990,7 +1990,8 @@ spider_net_open(struct net_device *netdev) goto alloc_rx_failed; /* Allocate rx skbs */ - if (spider_net_alloc_rx_skbs(card)) + result = spider_net_alloc_rx_skbs(card); + if (result) goto alloc_skbs_failed; spider_net_set_multi(netdev); diff --git a/drivers/net/irda/bfin_sir.c b/drivers/net/irda/bfin_sir.c index a06fca61c9a0..22b4527321b1 100644 --- a/drivers/net/irda/bfin_sir.c +++ b/drivers/net/irda/bfin_sir.c @@ -609,7 +609,7 @@ static int bfin_sir_open(struct net_device *dev) { struct bfin_sir_self *self = netdev_priv(dev); struct bfin_sir_port *port = self->sir_port; - int err = -ENOMEM; + int err; self->newspeed = 0; self->speed = 9600; @@ -623,8 +623,10 @@ static int bfin_sir_open(struct net_device *dev) bfin_sir_set_speed(port, 9600); self->irlap = irlap_open(dev, &self->qos, DRIVER_NAME); - if (!self->irlap) + if (!self->irlap) { + err = -ENOMEM; goto err_irlap; + } INIT_WORK(&self->work, bfin_sir_send_work); diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig index 450345261bd3..1e11f2bfd9ce 100644 --- a/drivers/net/phy/Kconfig +++ b/drivers/net/phy/Kconfig @@ -126,7 +126,7 @@ config MDIO_BITBANG config MDIO_GPIO tristate "Support for GPIO lib-based bitbanged MDIO buses" - depends on MDIO_BITBANG && GENERIC_GPIO + depends on MDIO_BITBANG && GPIOLIB ---help--- Supports GPIO lib-based MDIO busses. diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c index 24fbec27a22a..078795fe6e31 100644 --- a/drivers/net/usb/cdc_ether.c +++ b/drivers/net/usb/cdc_ether.c @@ -613,6 +613,13 @@ static const struct usb_device_id products [] = { .driver_info = 0, }, +/* Dell Wireless 5804 (Novatel E371) - handled by qmi_wwan */ +{ + USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x819b, USB_CLASS_COMM, + USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), + .driver_info = 0, +}, + /* AnyDATA ADU960S - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(0x16d5, 0x650a, USB_CLASS_COMM, diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 834e405fb57a..cf887c2384e9 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -501,6 +501,13 @@ static const struct usb_device_id products[] = { USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&qmi_wwan_info, }, + { /* Dell Wireless 5804 (Novatel E371) */ + USB_DEVICE_AND_INTERFACE_INFO(0x413C, 0x819b, + USB_CLASS_COMM, + USB_CDC_SUBCLASS_ETHERNET, + USB_CDC_PROTO_NONE), + .driver_info = (unsigned long)&qmi_wwan_info, + }, { /* ADU960S */ USB_DEVICE_AND_INTERFACE_INFO(0x16d5, 0x650a, USB_CLASS_COMM, diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c index a923d61c6fc5..a79e9d334928 100644 --- a/drivers/net/usb/sierra_net.c +++ b/drivers/net/usb/sierra_net.c @@ -426,6 +426,13 @@ static void sierra_net_dosync(struct usbnet *dev) dev_dbg(&dev->udev->dev, "%s", __func__); + /* The SIERRA_NET_HIP_MSYNC_ID command appears to request that the + * firmware restart itself. After restarting, the modem will respond + * with the SIERRA_NET_HIP_RESTART_ID indication. The driver continues + * sending MSYNC commands every few seconds until it receives the + * RESTART event from the firmware + */ + /* tell modem we are ready */ status = sierra_net_send_sync(dev); if (status < 0) @@ -704,6 +711,9 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf) /* set context index initially to 0 - prepares tx hdr template */ sierra_net_set_ctx_index(priv, 0); + /* prepare sync message template */ + memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg)); + /* decrease the rx_urb_size and max_tx_size to 4k on USB 1.1 */ dev->rx_urb_size = SIERRA_NET_RX_URB_SIZE; if (dev->udev->speed != USB_SPEED_HIGH) @@ -739,11 +749,6 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf) kfree(priv); return -ENODEV; } - /* prepare sync message from template */ - memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg)); - - /* initiate the sync sequence */ - sierra_net_dosync(dev); return 0; } @@ -766,8 +771,9 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf) netdev_err(dev->net, "usb_control_msg failed, status %d\n", status); - sierra_net_set_private(dev, NULL); + usbnet_status_stop(dev); + sierra_net_set_private(dev, NULL); kfree(priv); } @@ -908,6 +914,24 @@ static const struct driver_info sierra_net_info_direct_ip = { .tx_fixup = sierra_net_tx_fixup, }; +static int +sierra_net_probe(struct usb_interface *udev, const struct usb_device_id *prod) +{ + int ret; + + ret = usbnet_probe(udev, prod); + if (ret == 0) { + struct usbnet *dev = usb_get_intfdata(udev); + + ret = usbnet_status_start(dev, GFP_KERNEL); + if (ret == 0) { + /* Interrupt URB now set up; initiate sync sequence */ + sierra_net_dosync(dev); + } + } + return ret; +} + #define DIRECT_IP_DEVICE(vend, prod) \ {USB_DEVICE_INTERFACE_NUMBER(vend, prod, 7), \ .driver_info = (unsigned long)&sierra_net_info_direct_ip}, \ @@ -930,7 +954,7 @@ MODULE_DEVICE_TABLE(usb, products); static struct usb_driver sierra_net_driver = { .name = "sierra_net", .id_table = products, - .probe = usbnet_probe, + .probe = sierra_net_probe, .disconnect = usbnet_disconnect, .suspend = usbnet_suspend, .resume = usbnet_resume, diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 1e5a9b72650e..f95cb032394b 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -252,6 +252,70 @@ static int init_status (struct usbnet *dev, struct usb_interface *intf) return 0; } +/* Submit the interrupt URB if not previously submitted, increasing refcount */ +int usbnet_status_start(struct usbnet *dev, gfp_t mem_flags) +{ + int ret = 0; + + WARN_ON_ONCE(dev->interrupt == NULL); + if (dev->interrupt) { + mutex_lock(&dev->interrupt_mutex); + + if (++dev->interrupt_count == 1) + ret = usb_submit_urb(dev->interrupt, mem_flags); + + dev_dbg(&dev->udev->dev, "incremented interrupt URB count to %d\n", + dev->interrupt_count); + mutex_unlock(&dev->interrupt_mutex); + } + return ret; +} +EXPORT_SYMBOL_GPL(usbnet_status_start); + +/* For resume; submit interrupt URB if previously submitted */ +static int __usbnet_status_start_force(struct usbnet *dev, gfp_t mem_flags) +{ + int ret = 0; + + mutex_lock(&dev->interrupt_mutex); + if (dev->interrupt_count) { + ret = usb_submit_urb(dev->interrupt, mem_flags); + dev_dbg(&dev->udev->dev, + "submitted interrupt URB for resume\n"); + } + mutex_unlock(&dev->interrupt_mutex); + return ret; +} + +/* Kill the interrupt URB if all submitters want it killed */ +void usbnet_status_stop(struct usbnet *dev) +{ + if (dev->interrupt) { + mutex_lock(&dev->interrupt_mutex); + WARN_ON(dev->interrupt_count == 0); + + if (dev->interrupt_count && --dev->interrupt_count == 0) + usb_kill_urb(dev->interrupt); + + dev_dbg(&dev->udev->dev, + "decremented interrupt URB count to %d\n", + dev->interrupt_count); + mutex_unlock(&dev->interrupt_mutex); + } +} +EXPORT_SYMBOL_GPL(usbnet_status_stop); + +/* For suspend; always kill interrupt URB */ +static void __usbnet_status_stop_force(struct usbnet *dev) +{ + if (dev->interrupt) { + mutex_lock(&dev->interrupt_mutex); + usb_kill_urb(dev->interrupt); + dev_dbg(&dev->udev->dev, "killed interrupt URB for suspend\n"); + mutex_unlock(&dev->interrupt_mutex); + } +} + /* Passes this packet up the stack, updating its accounting. * Some link protocols batch packets, so their rx_fixup paths * can return clones as well as just modify the original skb. @@ -725,7 +789,7 @@ int usbnet_stop (struct net_device *net) if (!(info->flags & FLAG_AVOID_UNLINK_URBS)) usbnet_terminate_urbs(dev); - usb_kill_urb(dev->interrupt); + usbnet_status_stop(dev); usbnet_purge_paused_rxq(dev); @@ -787,7 +851,7 @@ int usbnet_open (struct net_device *net) /* start any status interrupt transfer */ if (dev->interrupt) { - retval = usb_submit_urb (dev->interrupt, GFP_KERNEL); + retval = usbnet_status_start(dev, GFP_KERNEL); if (retval < 0) { netif_err(dev, ifup, dev->net, "intr submit %d\n", retval); @@ -1458,6 +1522,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) dev->delay.data = (unsigned long) dev; init_timer (&dev->delay); mutex_init (&dev->phy_mutex); + mutex_init(&dev->interrupt_mutex); + dev->interrupt_count = 0; dev->net = net; strcpy (net->name, "usb%d"); @@ -1593,7 +1659,7 @@ int usbnet_suspend (struct usb_interface *intf, pm_message_t message) */ netif_device_detach (dev->net); usbnet_terminate_urbs(dev); - usb_kill_urb(dev->interrupt); + __usbnet_status_stop_force(dev); /* * reattach so runtime management can use and @@ -1613,9 +1679,8 @@ int usbnet_resume (struct usb_interface *intf) int retval; if (!--dev->suspend_count) { - /* resume interrupt URBs */ - if (dev->interrupt && test_bit(EVENT_DEV_OPEN, &dev->flags)) - usb_submit_urb(dev->interrupt, GFP_NOIO); + /* resume interrupt URB if it was previously submitted */ + __usbnet_status_start_force(dev, GFP_NOIO); spin_lock_irq(&dev->txq.lock); while ((res = usb_get_from_anchor(&dev->deferred))) { diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c index 23049aeca662..d5a57a9e329c 100644 --- a/drivers/of/of_mdio.c +++ b/drivers/of/of_mdio.c @@ -84,13 +84,10 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np) phy = get_phy_device(mdio, addr, is_c45); if (!phy || IS_ERR(phy)) { - phy = phy_device_create(mdio, addr, 0, false, NULL); - if (!phy || IS_ERR(phy)) { - dev_err(&mdio->dev, - "error creating PHY at address %i\n", - addr); - continue; - } + dev_err(&mdio->dev, + "cannot get PHY at address %i\n", + addr); + continue; } /* Associate the OF node with the device structure so it diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 748f8f3e9ff5..32e66a6f12d9 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -174,6 +174,7 @@ int pci_bus_add_device(struct pci_dev *dev) * Can not put in pci_device_add yet because resources * are not assigned yet for some devices. */ + pci_fixup_device(pci_fixup_final, dev); pci_create_sysfs_dev_files(dev); dev->match_driver = true; diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d40bed726769..2c1075213bec 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -563,8 +563,10 @@ static int msi_capability_init(struct pci_dev *dev, int nvec) entry->msi_attrib.default_irq = dev->irq; /* Save IOAPIC IRQ */ entry->msi_attrib.pos = dev->msi_cap; - entry->mask_pos = dev->msi_cap + (control & PCI_MSI_FLAGS_64BIT) ? - PCI_MSI_MASK_64 : PCI_MSI_MASK_32; + if (control & PCI_MSI_FLAGS_64BIT) + entry->mask_pos = dev->msi_cap + PCI_MSI_MASK_64; + else + entry->mask_pos = dev->msi_cap + PCI_MSI_MASK_32; /* All MSIs are unmasked by default, Mask them all */ if (entry->msi_attrib.maskbit) pci_read_config_dword(dev, entry->mask_pos, &entry->masked); diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 631aeb7d2d2d..70f10fa3c1b2 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1341,7 +1341,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) list_add_tail(&dev->bus_list, &bus->devices); up_write(&pci_bus_sem); - pci_fixup_device(pci_fixup_final, dev); ret = pcibios_add_device(dev); WARN_ON(ret < 0); diff --git a/drivers/pinctrl/sh-pfc/Kconfig b/drivers/pinctrl/sh-pfc/Kconfig index 0e1f99c33d47..f8a2ae413c7f 100644 --- a/drivers/pinctrl/sh-pfc/Kconfig +++ b/drivers/pinctrl/sh-pfc/Kconfig @@ -6,7 +6,7 @@ if ARCH_SHMOBILE || SUPERH config PINCTRL_SH_PFC # XXX move off the gpio dependency - depends on GENERIC_GPIO + depends on GPIOLIB select GPIO_SH_PFC if ARCH_REQUIRE_GPIOLIB select PINMUX select PINCONF @@ -40,19 +40,19 @@ config PINCTRL_PFC_R8A7779 config PINCTRL_PFC_SH7203 def_bool y depends on CPU_SUBTYPE_SH7203 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7264 def_bool y depends on CPU_SUBTYPE_SH7264 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7269 def_bool y depends on CPU_SUBTYPE_SH7269 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7372 @@ -68,55 +68,55 @@ config PINCTRL_PFC_SH73A0 config PINCTRL_PFC_SH7720 def_bool y depends on CPU_SUBTYPE_SH7720 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7722 def_bool y depends on CPU_SUBTYPE_SH7722 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7723 def_bool y depends on CPU_SUBTYPE_SH7723 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7724 def_bool y depends on CPU_SUBTYPE_SH7724 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7734 def_bool y depends on CPU_SUBTYPE_SH7734 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7757 def_bool y depends on CPU_SUBTYPE_SH7757 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7785 def_bool y depends on CPU_SUBTYPE_SH7785 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SH7786 def_bool y depends on CPU_SUBTYPE_SH7786 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC config PINCTRL_PFC_SHX3 def_bool y depends on CPU_SUBTYPE_SHX3 - depends on GENERIC_GPIO + depends on GPIOLIB select PINCTRL_SH_PFC endif diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig index a5d97eaee99e..8bb26446037e 100644 --- a/drivers/regulator/Kconfig +++ b/drivers/regulator/Kconfig @@ -66,7 +66,7 @@ config REGULATOR_USERSPACE_CONSUMER config REGULATOR_GPIO tristate "GPIO regulator support" - depends on GENERIC_GPIO + depends on GPIOLIB help This driver provides support for regulators that can be controlled via gpios. diff --git a/drivers/rtc/rtc-tile.c b/drivers/rtc/rtc-tile.c index 249b6531f119..fc3dee95f166 100644 --- a/drivers/rtc/rtc-tile.c +++ b/drivers/rtc/rtc-tile.c @@ -146,6 +146,7 @@ exit_driver_unregister: */ static void __exit tile_rtc_driver_exit(void) { + platform_device_unregister(tile_rtc_platform_device); platform_driver_unregister(&tile_rtc_platform_driver); } diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index 07ba32b07fb0..6eca019bcf30 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -822,8 +822,7 @@ dcssblk_make_request(struct request_queue *q, struct bio *bio) if ((bio->bi_sector & 7) != 0 || (bio->bi_size & 4095) != 0) /* Request is not page-aligned. */ goto fail; - if (((bio->bi_size >> 9) + bio->bi_sector) - > get_capacity(bio->bi_bdev->bd_disk)) { + if (bio_end_sector(bio) > get_capacity(bio->bi_bdev->bd_disk)) { /* Request beyond end of DCSS segment. */ goto fail; } diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c index 55cbd0180159..f42b0e15410f 100644 --- a/drivers/scsi/libsas/sas_expander.c +++ b/drivers/scsi/libsas/sas_expander.c @@ -2163,10 +2163,10 @@ int sas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, } /* do we need to support multiple segments? */ - if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) { + if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) { printk("%s: multiple segments req %u %u, rsp %u %u\n", - __func__, req->bio->bi_vcnt, blk_rq_bytes(req), - rsp->bio->bi_vcnt, blk_rq_bytes(rsp)); + __func__, bio_segments(req->bio), blk_rq_bytes(req), + bio_segments(rsp->bio), blk_rq_bytes(rsp)); return -EINVAL; } diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c index 8c2ffbe6af0f..193e7ae90c3b 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c @@ -1939,7 +1939,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, ioc->transport_cmds.status = MPT2_CMD_PENDING; /* Check if the request is split across multiple segments */ - if (req->bio->bi_vcnt > 1) { + if (bio_segments(req->bio) > 1) { u32 offset = 0; /* Allocate memory and copy the request */ @@ -1971,7 +1971,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, /* Check if the response needs to be populated across * multiple segments */ - if (rsp->bio->bi_vcnt > 1) { + if (bio_segments(rsp->bio) > 1) { pci_addr_in = pci_alloc_consistent(ioc->pdev, blk_rq_bytes(rsp), &pci_dma_in); if (!pci_addr_in) { @@ -2038,7 +2038,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, sgl_flags = (MPI2_SGE_FLAGS_SIMPLE_ELEMENT | MPI2_SGE_FLAGS_END_OF_BUFFER | MPI2_SGE_FLAGS_HOST_TO_IOC); sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT; - if (req->bio->bi_vcnt > 1) { + if (bio_segments(req->bio) > 1) { ioc->base_add_sg_single(psge, sgl_flags | (blk_rq_bytes(req) - 4), pci_dma_out); } else { @@ -2054,7 +2054,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, MPI2_SGE_FLAGS_LAST_ELEMENT | MPI2_SGE_FLAGS_END_OF_BUFFER | MPI2_SGE_FLAGS_END_OF_LIST); sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT; - if (rsp->bio->bi_vcnt > 1) { + if (bio_segments(rsp->bio) > 1) { ioc->base_add_sg_single(psge, sgl_flags | (blk_rq_bytes(rsp) + 4), pci_dma_in); } else { @@ -2099,7 +2099,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy, le16_to_cpu(mpi_reply->ResponseDataLength); /* check if the resp needs to be copied from the allocated * pci mem */ - if (rsp->bio->bi_vcnt > 1) { + if (bio_segments(rsp->bio) > 1) { u32 offset = 0; u32 bytes_to_copy = le16_to_cpu(mpi_reply->ResponseDataLength); diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index 141d8c10b764..92a9345d7a6b 100644 --- a/drivers/spi/Kconfig +++ b/drivers/spi/Kconfig @@ -62,7 +62,7 @@ config SPI_ALTERA config SPI_ATH79 tristate "Atheros AR71XX/AR724X/AR913X SPI controller driver" - depends on ATH79 && GENERIC_GPIO + depends on ATH79 && GPIOLIB select SPI_BITBANG help This enables support for the SPI controller present on the @@ -175,7 +175,7 @@ config SPI_FALCON config SPI_GPIO tristate "GPIO-based bitbanging SPI Master" - depends on GENERIC_GPIO + depends on GPIOLIB select SPI_BITBANG help This simple GPIO bitbanging SPI master uses the arch-neutral GPIO @@ -259,7 +259,7 @@ config SPI_FSL_ESPI config SPI_OC_TINY tristate "OpenCores tiny SPI" - depends on GENERIC_GPIO + depends on GPIOLIB select SPI_BITBANG help This is the driver for OpenCores tiny SPI master controller. @@ -457,7 +457,7 @@ config SPI_TOPCLIFF_PCH config SPI_TXX9 tristate "Toshiba TXx9 SPI controller" - depends on GENERIC_GPIO && CPU_TX49XX + depends on GPIOLIB && CPU_TX49XX help SPI driver for Toshiba TXx9 MIPS SoCs diff --git a/drivers/ssb/driver_mipscore.c b/drivers/ssb/driver_mipscore.c index fa385a368a56..09077067b0c8 100644 --- a/drivers/ssb/driver_mipscore.c +++ b/drivers/ssb/driver_mipscore.c @@ -18,7 +18,7 @@ #include "ssb_private.h" -static const char *part_probes[] = { "bcm47xxpart", NULL }; +static const char * const part_probes[] = { "bcm47xxpart", NULL }; static struct physmap_flash_data ssb_pflash_data = { .part_probe_types = part_probes, diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 9f61d46da157..c0c95be0f969 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -54,7 +54,7 @@ config ANDROID_TIMED_OUTPUT config ANDROID_TIMED_GPIO tristate "Android timed gpio driver" - depends on GENERIC_GPIO && ANDROID_TIMED_OUTPUT + depends on GPIOLIB && ANDROID_TIMED_OUTPUT default n config ANDROID_LOW_MEMORY_KILLER diff --git a/drivers/staging/iio/accel/Kconfig b/drivers/staging/iio/accel/Kconfig index e2e786dc9c7b..ad45dfbdf417 100644 --- a/drivers/staging/iio/accel/Kconfig +++ b/drivers/staging/iio/accel/Kconfig @@ -61,7 +61,7 @@ config LIS3L02DQ depends on SPI select IIO_TRIGGER if IIO_BUFFER depends on !IIO_BUFFER || IIO_KFIFO_BUF - depends on GENERIC_GPIO + depends on GPIOLIB help Say yes here to build SPI support for the ST microelectronics accelerometer. The driver supplies direct access via sysfs files diff --git a/drivers/staging/iio/adc/Kconfig b/drivers/staging/iio/adc/Kconfig index d990829008ff..cabc7a367db5 100644 --- a/drivers/staging/iio/adc/Kconfig +++ b/drivers/staging/iio/adc/Kconfig @@ -73,7 +73,7 @@ config AD7780 config AD7816 tristate "Analog Devices AD7816/7/8 temperature sensor and ADC driver" depends on SPI - depends on GENERIC_GPIO + depends on GPIOLIB help Say yes here to build support for Analog Devices AD7816/7/8 temperature sensors and ADC. diff --git a/drivers/staging/iio/addac/Kconfig b/drivers/staging/iio/addac/Kconfig index 698a8970b372..e6795e0bed1d 100644 --- a/drivers/staging/iio/addac/Kconfig +++ b/drivers/staging/iio/addac/Kconfig @@ -5,7 +5,7 @@ menu "Analog digital bi-direction converters" config ADT7316 tristate "Analog Devices ADT7316/7/8 ADT7516/7/9 temperature sensor, ADC and DAC driver" - depends on GENERIC_GPIO + depends on GPIOLIB help Say yes here to build support for Analog Devices ADT7316, ADT7317, ADT7318 and ADT7516, ADT7517, ADT7519 temperature sensors, ADC and DAC. diff --git a/drivers/staging/iio/resolver/Kconfig b/drivers/staging/iio/resolver/Kconfig index 49f69ef986fc..ce360f163216 100644 --- a/drivers/staging/iio/resolver/Kconfig +++ b/drivers/staging/iio/resolver/Kconfig @@ -13,7 +13,7 @@ config AD2S90 config AD2S1200 tristate "Analog Devices ad2s1200/ad2s1205 driver" depends on SPI - depends on GENERIC_GPIO + depends on GPIOLIB help Say yes here to build support for Analog Devices spi resolver to digital converters, ad2s1200 and ad2s1205, provides direct access @@ -22,7 +22,7 @@ config AD2S1200 config AD2S1210 tristate "Analog Devices ad2s1210 driver" depends on SPI - depends on GENERIC_GPIO + depends on GPIOLIB help Say yes here to build support for Analog Devices spi resolver to digital converters, ad2s1210, provides direct access via sysfs. diff --git a/drivers/staging/iio/trigger/Kconfig b/drivers/staging/iio/trigger/Kconfig index d44d3ad26fa5..1a051da62505 100644 --- a/drivers/staging/iio/trigger/Kconfig +++ b/drivers/staging/iio/trigger/Kconfig @@ -14,7 +14,7 @@ config IIO_PERIODIC_RTC_TRIGGER config IIO_GPIO_TRIGGER tristate "GPIO trigger" - depends on GENERIC_GPIO + depends on GPIOLIB help Provides support for using GPIO pins as IIO triggers. diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig index a764f165b589..5e3c02554d99 100644 --- a/drivers/thermal/Kconfig +++ b/drivers/thermal/Kconfig @@ -67,15 +67,16 @@ config THERMAL_GOV_USER_SPACE Enable this to let the user space manage the platform thermals. config CPU_THERMAL - tristate "generic cpu cooling support" + bool "generic cpu cooling support" depends on CPU_FREQ select CPU_FREQ_TABLE help This implements the generic cpu cooling mechanism through frequency - reduction, cpu hotplug and any other ways of reducing temperature. An - ACPI version of this already exists(drivers/acpi/processor_thermal.c). + reduction. An ACPI version of this already exists + (drivers/acpi/processor_thermal.c). This will be useful for platforms using the generic thermal interface and not the ACPI interface. + If you want this support, you should say Y here. config THERMAL_EMULATION @@ -86,6 +87,10 @@ config THERMAL_EMULATION user can manually input temperature and test the different trip threshold behaviour for simulation purpose. + WARNING: Be careful while enabling this option on production systems, + because userland can easily disable the thermal policy by simply + flooding this sysfs node with low temperature values. + config SPEAR_THERMAL bool "SPEAr thermal sensor driver" depends on PLAT_SPEAR @@ -117,15 +122,6 @@ config EXYNOS_THERMAL If you say yes here you get support for TMU (Thermal Management Unit) on SAMSUNG EXYNOS series of SoC. -config EXYNOS_THERMAL_EMUL - bool "EXYNOS TMU emulation mode support" - depends on EXYNOS_THERMAL - help - Exynos 4412 and 4414 and 5 series has emulation mode on TMU. - Enable this option will be make sysfs node in exynos thermal platform - device directory to support emulation mode. With emulation mode sysfs - node, you can manually input temperature to TMU for simulation purpose. - config DOVE_THERMAL tristate "Temperature sensor on Marvell Dove SoCs" depends on ARCH_DOVE @@ -144,6 +140,14 @@ config DB8500_THERMAL created. Cooling devices can be bound to the trip points to cool this thermal zone if trip points reached. +config ARMADA_THERMAL + tristate "Armada 370/XP thermal management" + depends on ARCH_MVEBU + depends on OF + help + Enable this option if you want to have support for thermal management + controller present in Armada 370 and Armada XP SoC. + config DB8500_CPUFREQ_COOLING tristate "DB8500 cpufreq cooling" depends on ARCH_U8500 diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile index d3a2b38c31e8..c054d410ac3f 100644 --- a/drivers/thermal/Makefile +++ b/drivers/thermal/Makefile @@ -3,14 +3,15 @@ # obj-$(CONFIG_THERMAL) += thermal_sys.o +thermal_sys-y += thermal_core.o # governors -obj-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o -obj-$(CONFIG_THERMAL_GOV_STEP_WISE) += step_wise.o -obj-$(CONFIG_THERMAL_GOV_USER_SPACE) += user_space.o +thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o +thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE) += step_wise.o +thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE) += user_space.o # cpufreq cooling -obj-$(CONFIG_CPU_THERMAL) += cpu_cooling.o +thermal_sys-$(CONFIG_CPU_THERMAL) += cpu_cooling.o # platform thermal drivers obj-$(CONFIG_SPEAR_THERMAL) += spear_thermal.o @@ -19,6 +20,7 @@ obj-$(CONFIG_KIRKWOOD_THERMAL) += kirkwood_thermal.o obj-$(CONFIG_EXYNOS_THERMAL) += exynos_thermal.o obj-$(CONFIG_DOVE_THERMAL) += dove_thermal.o obj-$(CONFIG_DB8500_THERMAL) += db8500_thermal.o +obj-$(CONFIG_ARMADA_THERMAL) += armada_thermal.o obj-$(CONFIG_DB8500_CPUFREQ_COOLING) += db8500_cpufreq_cooling.o obj-$(CONFIG_INTEL_POWERCLAMP) += intel_powerclamp.o diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c new file mode 100644 index 000000000000..5b4d75fd7b49 --- /dev/null +++ b/drivers/thermal/armada_thermal.c @@ -0,0 +1,232 @@ +/* + * Marvell Armada 370/XP thermal sensor driver + * + * Copyright (C) 2013 Marvell + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define THERMAL_VALID_OFFSET 9 +#define THERMAL_VALID_MASK 0x1 +#define THERMAL_TEMP_OFFSET 10 +#define THERMAL_TEMP_MASK 0x1ff + +/* Thermal Manager Control and Status Register */ +#define PMU_TDC0_SW_RST_MASK (0x1 << 1) +#define PMU_TM_DISABLE_OFFS 0 +#define PMU_TM_DISABLE_MASK (0x1 << PMU_TM_DISABLE_OFFS) +#define PMU_TDC0_REF_CAL_CNT_OFFS 11 +#define PMU_TDC0_REF_CAL_CNT_MASK (0x1ff << PMU_TDC0_REF_CAL_CNT_OFFS) +#define PMU_TDC0_OTF_CAL_MASK (0x1 << 30) +#define PMU_TDC0_START_CAL_MASK (0x1 << 25) + +struct armada_thermal_ops; + +/* Marvell EBU Thermal Sensor Dev Structure */ +struct armada_thermal_priv { + void __iomem *sensor; + void __iomem *control; + struct armada_thermal_ops *ops; +}; + +struct armada_thermal_ops { + /* Initialize the sensor */ + void (*init_sensor)(struct armada_thermal_priv *); + + /* Test for a valid sensor value (optional) */ + bool (*is_valid)(struct armada_thermal_priv *); +}; + +static void armadaxp_init_sensor(struct armada_thermal_priv *priv) +{ + unsigned long reg; + + reg = readl_relaxed(priv->control); + reg |= PMU_TDC0_OTF_CAL_MASK; + writel(reg, priv->control); + + /* Reference calibration value */ + reg &= ~PMU_TDC0_REF_CAL_CNT_MASK; + reg |= (0xf1 << PMU_TDC0_REF_CAL_CNT_OFFS); + writel(reg, priv->control); + + /* Reset the sensor */ + reg = readl_relaxed(priv->control); + writel((reg | PMU_TDC0_SW_RST_MASK), priv->control); + + writel(reg, priv->control); + + /* Enable the sensor */ + reg = readl_relaxed(priv->sensor); + reg &= ~PMU_TM_DISABLE_MASK; + writel(reg, priv->sensor); +} + +static void armada370_init_sensor(struct armada_thermal_priv *priv) +{ + unsigned long reg; + + reg = readl_relaxed(priv->control); + reg |= PMU_TDC0_OTF_CAL_MASK; + writel(reg, priv->control); + + /* Reference calibration value */ + reg &= ~PMU_TDC0_REF_CAL_CNT_MASK; + reg |= (0xf1 << PMU_TDC0_REF_CAL_CNT_OFFS); + writel(reg, priv->control); + + reg &= ~PMU_TDC0_START_CAL_MASK; + writel(reg, priv->control); + + mdelay(10); +} + +static bool armada_is_valid(struct armada_thermal_priv *priv) +{ + unsigned long reg = readl_relaxed(priv->sensor); + + return (reg >> THERMAL_VALID_OFFSET) & THERMAL_VALID_MASK; +} + +static int armada_get_temp(struct thermal_zone_device *thermal, + unsigned long *temp) +{ + struct armada_thermal_priv *priv = thermal->devdata; + unsigned long reg; + + /* Valid check */ + if (priv->ops->is_valid && !priv->ops->is_valid(priv)) { + dev_err(&thermal->device, + "Temperature sensor reading not valid\n"); + return -EIO; + } + + reg = readl_relaxed(priv->sensor); + reg = (reg >> THERMAL_TEMP_OFFSET) & THERMAL_TEMP_MASK; + *temp = (3153000000UL - (10000000UL*reg)) / 13825; + return 0; +} + +static struct thermal_zone_device_ops ops = { + .get_temp = armada_get_temp, +}; + +static const struct armada_thermal_ops armadaxp_ops = { + .init_sensor = armadaxp_init_sensor, +}; + +static const struct armada_thermal_ops armada370_ops = { + .is_valid = armada_is_valid, + .init_sensor = armada370_init_sensor, +}; + +static const struct of_device_id armada_thermal_id_table[] = { + { + .compatible = "marvell,armadaxp-thermal", + .data = &armadaxp_ops, + }, + { + .compatible = "marvell,armada370-thermal", + .data = &armada370_ops, + }, + { + /* sentinel */ + }, +}; +MODULE_DEVICE_TABLE(of, armada_thermal_id_table); + +static int armada_thermal_probe(struct platform_device *pdev) +{ + struct thermal_zone_device *thermal; + const struct of_device_id *match; + struct armada_thermal_priv *priv; + struct resource *res; + + match = of_match_device(armada_thermal_id_table, &pdev->dev); + if (!match) + return -ENODEV; + + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!res) { + dev_err(&pdev->dev, "Failed to get platform resource\n"); + return -ENODEV; + } + + priv->sensor = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(priv->sensor)) + return PTR_ERR(priv->sensor); + + res = platform_get_resource(pdev, IORESOURCE_MEM, 1); + if (!res) { + dev_err(&pdev->dev, "Failed to get platform resource\n"); + return -ENODEV; + } + + priv->control = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(priv->control)) + return PTR_ERR(priv->control); + + priv->ops = (struct armada_thermal_ops *)match->data; + priv->ops->init_sensor(priv); + + thermal = thermal_zone_device_register("armada_thermal", 0, 0, + priv, &ops, NULL, 0, 0); + if (IS_ERR(thermal)) { + dev_err(&pdev->dev, + "Failed to register thermal zone device\n"); + return PTR_ERR(thermal); + } + + platform_set_drvdata(pdev, thermal); + + return 0; +} + +static int armada_thermal_exit(struct platform_device *pdev) +{ + struct thermal_zone_device *armada_thermal = + platform_get_drvdata(pdev); + + thermal_zone_device_unregister(armada_thermal); + platform_set_drvdata(pdev, NULL); + + return 0; +} + +static struct platform_driver armada_thermal_driver = { + .probe = armada_thermal_probe, + .remove = armada_thermal_exit, + .driver = { + .name = "armada_thermal", + .owner = THIS_MODULE, + .of_match_table = of_match_ptr(armada_thermal_id_table), + }, +}; + +module_platform_driver(armada_thermal_driver); + +MODULE_AUTHOR("Ezequiel Garcia "); +MODULE_DESCRIPTION("Armada 370/XP thermal driver"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index 8dc44cbb3e09..c94bf2e5de62 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -20,10 +20,8 @@ * * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -#include #include #include -#include #include #include #include @@ -31,21 +29,19 @@ #include /** - * struct cpufreq_cooling_device + * struct cpufreq_cooling_device - data for cooling device with cpufreq * @id: unique integer value corresponding to each cpufreq_cooling_device * registered. - * @cool_dev: thermal_cooling_device pointer to keep track of the the - * egistered cooling device. + * @cool_dev: thermal_cooling_device pointer to keep track of the + * registered cooling device. * @cpufreq_state: integer value representing the current state of cpufreq * cooling devices. * @cpufreq_val: integer value representing the absolute value of the clipped * frequency. * @allowed_cpus: all the cpus involved for this cpufreq_cooling_device. - * @node: list_head to link all cpufreq_cooling_device together. * * This structure is required for keeping information of each - * cpufreq_cooling_device registered as a list whose head is represented by - * cooling_cpufreq_list. In order to prevent corruption of this list a + * cpufreq_cooling_device registered. In order to prevent corruption of this a * mutex lock cooling_cpufreq_lock is used. */ struct cpufreq_cooling_device { @@ -54,9 +50,7 @@ struct cpufreq_cooling_device { unsigned int cpufreq_state; unsigned int cpufreq_val; struct cpumask allowed_cpus; - struct list_head node; }; -static LIST_HEAD(cooling_cpufreq_list); static DEFINE_IDR(cpufreq_idr); static DEFINE_MUTEX(cooling_cpufreq_lock); @@ -70,6 +64,11 @@ static struct cpufreq_cooling_device *notify_device; * get_idr - function to get a unique id. * @idr: struct idr * handle used to create a id. * @id: int * value generated by this function. + * + * This function will populate @id with an unique + * id, using the idr API. + * + * Return: 0 on success, an error code on failure. */ static int get_idr(struct idr *idr, int *id) { @@ -81,6 +80,7 @@ static int get_idr(struct idr *idr, int *id) if (unlikely(ret < 0)) return ret; *id = ret; + return 0; } @@ -99,63 +99,162 @@ static void release_idr(struct idr *idr, int id) /* Below code defines functions to be used for cpufreq as cooling device */ /** - * is_cpufreq_valid - function to check if a cpu has frequency transition policy. + * is_cpufreq_valid - function to check frequency transitioning capability. * @cpu: cpu for which check is needed. + * + * This function will check the current state of the system if + * it is capable of changing the frequency for a given @cpu. + * + * Return: 0 if the system is not currently capable of changing + * the frequency of given cpu. !0 in case the frequency is changeable. */ static int is_cpufreq_valid(int cpu) { struct cpufreq_policy policy; + return !cpufreq_get_policy(&policy, cpu); } +enum cpufreq_cooling_property { + GET_LEVEL, + GET_FREQ, + GET_MAXL, +}; + /** - * get_cpu_frequency - get the absolute value of frequency from level. - * @cpu: cpu for which frequency is fetched. - * @level: level of frequency, equals cooling state of cpu cooling device - * e.g level=0 --> 1st MAX FREQ, level=1 ---> 2nd MAX FREQ, .... etc + * get_property - fetch a property of interest for a give cpu. + * @cpu: cpu for which the property is required + * @input: query parameter + * @output: query return + * @property: type of query (frequency, level, max level) + * + * This is the common function to + * 1. get maximum cpu cooling states + * 2. translate frequency to cooling state + * 3. translate cooling state to frequency + * Note that the code may be not in good shape + * but it is written in this way in order to: + * a) reduce duplicate code as most of the code can be shared. + * b) make sure the logic is consistent when translating between + * cooling states and frequencies. + * + * Return: 0 on success, -EINVAL when invalid parameters are passed. */ -static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level) +static int get_property(unsigned int cpu, unsigned long input, + unsigned int *output, + enum cpufreq_cooling_property property) { - int ret = 0, i = 0; - unsigned long level_index; - bool descend = false; + int i, j; + unsigned long max_level = 0, level = 0; + unsigned int freq = CPUFREQ_ENTRY_INVALID; + int descend = -1; struct cpufreq_frequency_table *table = cpufreq_frequency_get_table(cpu); + + if (!output) + return -EINVAL; + if (!table) - return ret; + return -EINVAL; - while (table[i].frequency != CPUFREQ_TABLE_END) { + for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) { + /* ignore invalid entries */ if (table[i].frequency == CPUFREQ_ENTRY_INVALID) continue; - /*check if table in ascending or descending order*/ - if ((table[i + 1].frequency != CPUFREQ_TABLE_END) && - (table[i + 1].frequency < table[i].frequency) - && !descend) { - descend = true; - } + /* ignore duplicate entry */ + if (freq == table[i].frequency) + continue; + + /* get the frequency order */ + if (freq != CPUFREQ_ENTRY_INVALID && descend != -1) + descend = !!(freq > table[i].frequency); - /*return if level matched and table in descending order*/ - if (descend && i == level) - return table[i].frequency; - i++; + freq = table[i].frequency; + max_level++; } - i--; - if (level > i || descend) - return ret; - level_index = i - level; + /* get max level */ + if (property == GET_MAXL) { + *output = (unsigned int)max_level; + return 0; + } - /*Scan the table in reverse order and match the level*/ - while (i >= 0) { + if (property == GET_FREQ) + level = descend ? input : (max_level - input - 1); + + for (i = 0, j = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) { + /* ignore invalid entry */ if (table[i].frequency == CPUFREQ_ENTRY_INVALID) continue; - /*return if level matched*/ - if (i == level_index) - return table[i].frequency; - i--; + + /* ignore duplicate entry */ + if (freq == table[i].frequency) + continue; + + /* now we have a valid frequency entry */ + freq = table[i].frequency; + + if (property == GET_LEVEL && (unsigned int)input == freq) { + /* get level by frequency */ + *output = descend ? j : (max_level - j - 1); + return 0; + } + if (property == GET_FREQ && level == j) { + /* get frequency by level */ + *output = freq; + return 0; + } + j++; } - return ret; + + return -EINVAL; +} + +/** + * cpufreq_cooling_get_level - for a give cpu, return the cooling level. + * @cpu: cpu for which the level is required + * @freq: the frequency of interest + * + * This function will match the cooling level corresponding to the + * requested @freq and return it. + * + * Return: The matched cooling level on success or THERMAL_CSTATE_INVALID + * otherwise. + */ +unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq) +{ + unsigned int val; + + if (get_property(cpu, (unsigned long)freq, &val, GET_LEVEL)) + return THERMAL_CSTATE_INVALID; + + return (unsigned long)val; +} +EXPORT_SYMBOL_GPL(cpufreq_cooling_get_level); + +/** + * get_cpu_frequency - get the absolute value of frequency from level. + * @cpu: cpu for which frequency is fetched. + * @level: cooling level + * + * This function matches cooling level with frequency. Based on a cooling level + * of frequency, equals cooling state of cpu cooling device, it will return + * the corresponding frequency. + * e.g level=0 --> 1st MAX FREQ, level=1 ---> 2nd MAX FREQ, .... etc + * + * Return: 0 on error, the corresponding frequency otherwise. + */ +static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level) +{ + int ret = 0; + unsigned int freq; + + ret = get_property(cpu, level, &freq, GET_FREQ); + if (ret) + return 0; + + return freq; } /** @@ -163,13 +262,19 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level) * @cpufreq_device: cpufreq_cooling_device pointer containing frequency * clipping data. * @cooling_state: value of the cooling state. + * + * Function used to make sure the cpufreq layer is aware of current thermal + * limits. The limits are applied by updating the cpufreq policy. + * + * Return: 0 on success, an error code otherwise (-EINVAL in case wrong + * cooling state). */ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device, - unsigned long cooling_state) + unsigned long cooling_state) { unsigned int cpuid, clip_freq; - struct cpumask *maskPtr = &cpufreq_device->allowed_cpus; - unsigned int cpu = cpumask_any(maskPtr); + struct cpumask *mask = &cpufreq_device->allowed_cpus; + unsigned int cpu = cpumask_any(mask); /* Check if the old cooling action is same as new cooling action */ @@ -184,7 +289,7 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device, cpufreq_device->cpufreq_val = clip_freq; notify_device = cpufreq_device; - for_each_cpu(cpuid, maskPtr) { + for_each_cpu(cpuid, mask) { if (is_cpufreq_valid(cpuid)) cpufreq_update_policy(cpuid); } @@ -199,9 +304,15 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device, * @nb: struct notifier_block * with callback info. * @event: value showing cpufreq event for which this function invoked. * @data: callback-specific data + * + * Callback to highjack the notification on cpufreq policy transition. + * Every time there is a change in policy, we will intercept and + * update the cpufreq policy with thermal constraints. + * + * Return: 0 (success) */ static int cpufreq_thermal_notifier(struct notifier_block *nb, - unsigned long event, void *data) + unsigned long event, void *data) { struct cpufreq_policy *policy = data; unsigned long max_freq = 0; @@ -212,7 +323,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, if (cpumask_test_cpu(policy->cpu, ¬ify_device->allowed_cpus)) max_freq = notify_device->cpufreq_val; - /* Never exceed user_policy.max*/ + /* Never exceed user_policy.max */ if (max_freq > policy->user_policy.max) max_freq = policy->user_policy.max; @@ -222,50 +333,46 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, return 0; } -/* - * cpufreq cooling device callback functions are defined below - */ +/* cpufreq cooling device callback functions are defined below */ /** * cpufreq_get_max_state - callback function to get the max cooling state. * @cdev: thermal cooling device pointer. * @state: fill this variable with the max cooling state. + * + * Callback for the thermal cooling device to return the cpufreq + * max cooling state. + * + * Return: 0 on success, an error code otherwise. */ static int cpufreq_get_max_state(struct thermal_cooling_device *cdev, unsigned long *state) { struct cpufreq_cooling_device *cpufreq_device = cdev->devdata; - struct cpumask *maskPtr = &cpufreq_device->allowed_cpus; + struct cpumask *mask = &cpufreq_device->allowed_cpus; unsigned int cpu; - struct cpufreq_frequency_table *table; - unsigned long count = 0; - int i = 0; - - cpu = cpumask_any(maskPtr); - table = cpufreq_frequency_get_table(cpu); - if (!table) { - *state = 0; - return 0; - } + unsigned int count = 0; + int ret; - for (i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++) { - if (table[i].frequency == CPUFREQ_ENTRY_INVALID) - continue; - count++; - } + cpu = cpumask_any(mask); - if (count > 0) { - *state = --count; - return 0; - } + ret = get_property(cpu, 0, &count, GET_MAXL); - return -EINVAL; + if (count > 0) + *state = count; + + return ret; } /** * cpufreq_get_cur_state - callback function to get the current cooling state. * @cdev: thermal cooling device pointer. * @state: fill this variable with the current cooling state. + * + * Callback for the thermal cooling device to return the cpufreq + * current cooling state. + * + * Return: 0 on success, an error code otherwise. */ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev, unsigned long *state) @@ -273,6 +380,7 @@ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev, struct cpufreq_cooling_device *cpufreq_device = cdev->devdata; *state = cpufreq_device->cpufreq_state; + return 0; } @@ -280,6 +388,11 @@ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev, * cpufreq_set_cur_state - callback function to set the current cooling state. * @cdev: thermal cooling device pointer. * @state: set this variable to the current cooling state. + * + * Callback for the thermal cooling device to change the cpufreq + * current cooling state. + * + * Return: 0 on success, an error code otherwise. */ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, unsigned long state) @@ -304,9 +417,16 @@ static struct notifier_block thermal_cpufreq_notifier_block = { /** * cpufreq_cooling_register - function to create cpufreq cooling device. * @clip_cpus: cpumask of cpus where the frequency constraints will happen. + * + * This interface function registers the cpufreq cooling device with the name + * "thermal-cpufreq-%x". This api can support multiple instances of cpufreq + * cooling devices. + * + * Return: a valid struct thermal_cooling_device pointer on success, + * on failure, it returns a corresponding ERR_PTR(). */ -struct thermal_cooling_device *cpufreq_cooling_register( - const struct cpumask *clip_cpus) +struct thermal_cooling_device * +cpufreq_cooling_register(const struct cpumask *clip_cpus) { struct thermal_cooling_device *cool_dev; struct cpufreq_cooling_device *cpufreq_dev = NULL; @@ -315,9 +435,9 @@ struct thermal_cooling_device *cpufreq_cooling_register( int ret = 0, i; struct cpufreq_policy policy; - /*Verify that all the clip cpus have same freq_min, freq_max limit*/ + /* Verify that all the clip cpus have same freq_min, freq_max limit */ for_each_cpu(i, clip_cpus) { - /*continue if cpufreq policy not found and not return error*/ + /* continue if cpufreq policy not found and not return error */ if (!cpufreq_get_policy(&policy, i)) continue; if (min == 0 && max == 0) { @@ -325,12 +445,12 @@ struct thermal_cooling_device *cpufreq_cooling_register( max = policy.cpuinfo.max_freq; } else { if (min != policy.cpuinfo.min_freq || - max != policy.cpuinfo.max_freq) + max != policy.cpuinfo.max_freq) return ERR_PTR(-EINVAL); } } cpufreq_dev = kzalloc(sizeof(struct cpufreq_cooling_device), - GFP_KERNEL); + GFP_KERNEL); if (!cpufreq_dev) return ERR_PTR(-ENOMEM); @@ -342,10 +462,11 @@ struct thermal_cooling_device *cpufreq_cooling_register( return ERR_PTR(-EINVAL); } - sprintf(dev_name, "thermal-cpufreq-%d", cpufreq_dev->id); + snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d", + cpufreq_dev->id); cool_dev = thermal_cooling_device_register(dev_name, cpufreq_dev, - &cpufreq_cooling_ops); + &cpufreq_cooling_ops); if (!cool_dev) { release_idr(&cpufreq_idr, cpufreq_dev->id); kfree(cpufreq_dev); @@ -358,17 +479,20 @@ struct thermal_cooling_device *cpufreq_cooling_register( /* Register the notifier for first cpufreq cooling device */ if (cpufreq_dev_count == 0) cpufreq_register_notifier(&thermal_cpufreq_notifier_block, - CPUFREQ_POLICY_NOTIFIER); + CPUFREQ_POLICY_NOTIFIER); cpufreq_dev_count++; mutex_unlock(&cooling_cpufreq_lock); + return cool_dev; } -EXPORT_SYMBOL(cpufreq_cooling_register); +EXPORT_SYMBOL_GPL(cpufreq_cooling_register); /** * cpufreq_cooling_unregister - function to remove cpufreq cooling device. * @cdev: thermal cooling device pointer. + * + * This interface function unregisters the "thermal-cpufreq-%x" cooling device. */ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) { @@ -378,14 +502,13 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) cpufreq_dev_count--; /* Unregister the notifier for the last cpufreq cooling device */ - if (cpufreq_dev_count == 0) { + if (cpufreq_dev_count == 0) cpufreq_unregister_notifier(&thermal_cpufreq_notifier_block, - CPUFREQ_POLICY_NOTIFIER); - } + CPUFREQ_POLICY_NOTIFIER); mutex_unlock(&cooling_cpufreq_lock); thermal_cooling_device_unregister(cpufreq_dev->cool_dev); release_idr(&cpufreq_idr, cpufreq_dev->id); kfree(cpufreq_dev); } -EXPORT_SYMBOL(cpufreq_cooling_unregister); +EXPORT_SYMBOL_GPL(cpufreq_cooling_unregister); diff --git a/drivers/thermal/db8500_cpufreq_cooling.c b/drivers/thermal/db8500_cpufreq_cooling.c index 21419851fc02..786d19263ab0 100644 --- a/drivers/thermal/db8500_cpufreq_cooling.c +++ b/drivers/thermal/db8500_cpufreq_cooling.c @@ -37,7 +37,7 @@ static int db8500_cpufreq_cooling_probe(struct platform_device *pdev) cpumask_set_cpu(0, &mask_val); cdev = cpufreq_cooling_register(&mask_val); - if (IS_ERR_OR_NULL(cdev)) { + if (IS_ERR(cdev)) { dev_err(&pdev->dev, "Failed to register cooling device\n"); return PTR_ERR(cdev); } diff --git a/drivers/thermal/db8500_thermal.c b/drivers/thermal/db8500_thermal.c index 61ce60a35921..1e3b3bf9f993 100644 --- a/drivers/thermal/db8500_thermal.c +++ b/drivers/thermal/db8500_thermal.c @@ -419,7 +419,8 @@ static int db8500_thermal_probe(struct platform_device *pdev) low_irq = platform_get_irq_byname(pdev, "IRQ_HOTMON_LOW"); if (low_irq < 0) { dev_err(&pdev->dev, "Get IRQ_HOTMON_LOW failed.\n"); - return low_irq; + ret = low_irq; + goto out_unlock; } ret = devm_request_threaded_irq(&pdev->dev, low_irq, NULL, @@ -427,13 +428,14 @@ static int db8500_thermal_probe(struct platform_device *pdev) "dbx500_temp_low", pzone); if (ret < 0) { dev_err(&pdev->dev, "Failed to allocate temp low irq.\n"); - return ret; + goto out_unlock; } high_irq = platform_get_irq_byname(pdev, "IRQ_HOTMON_HIGH"); if (high_irq < 0) { dev_err(&pdev->dev, "Get IRQ_HOTMON_HIGH failed.\n"); - return high_irq; + ret = high_irq; + goto out_unlock; } ret = devm_request_threaded_irq(&pdev->dev, high_irq, NULL, @@ -441,15 +443,16 @@ static int db8500_thermal_probe(struct platform_device *pdev) "dbx500_temp_high", pzone); if (ret < 0) { dev_err(&pdev->dev, "Failed to allocate temp high irq.\n"); - return ret; + goto out_unlock; } pzone->therm_dev = thermal_zone_device_register("db8500_thermal_zone", ptrips->num_trips, 0, pzone, &thdev_ops, NULL, 0, 0); - if (IS_ERR_OR_NULL(pzone->therm_dev)) { + if (IS_ERR(pzone->therm_dev)) { dev_err(&pdev->dev, "Register thermal zone device failed.\n"); - return PTR_ERR(pzone->therm_dev); + ret = PTR_ERR(pzone->therm_dev); + goto out_unlock; } dev_info(&pdev->dev, "Thermal zone device registered.\n"); @@ -461,9 +464,11 @@ static int db8500_thermal_probe(struct platform_device *pdev) platform_set_drvdata(pdev, pzone); pzone->mode = THERMAL_DEVICE_ENABLED; + +out_unlock: mutex_unlock(&pzone->th_lock); - return 0; + return ret; } static int db8500_thermal_remove(struct platform_device *pdev) diff --git a/drivers/thermal/dove_thermal.c b/drivers/thermal/dove_thermal.c index 3078c403b42d..4b15a5f270dc 100644 --- a/drivers/thermal/dove_thermal.c +++ b/drivers/thermal/dove_thermal.c @@ -107,12 +107,13 @@ static int dove_get_temp(struct thermal_zone_device *thermal, } /* - * Calculate temperature. See Section 8.10.1 of 88AP510, - * Documentation/arm/Marvell/README + * Calculate temperature. According to Marvell internal + * documentation the formula for this is: + * Celsius = (322-reg)/1.3625 */ reg = readl_relaxed(priv->sensor); reg = (reg >> DOVE_THERMAL_TEMP_OFFSET) & DOVE_THERMAL_TEMP_MASK; - *temp = ((2281638UL - (7298*reg)) / 10); + *temp = ((3220000000UL - (10000000UL * reg)) / 13625); return 0; } diff --git a/drivers/thermal/exynos_thermal.c b/drivers/thermal/exynos_thermal.c index b777ae6f0a8f..d20ce9e61403 100644 --- a/drivers/thermal/exynos_thermal.c +++ b/drivers/thermal/exynos_thermal.c @@ -98,13 +98,13 @@ #define IDLE_INTERVAL 10000 #define MCELSIUS 1000 -#ifdef CONFIG_EXYNOS_THERMAL_EMUL +#ifdef CONFIG_THERMAL_EMULATION #define EXYNOS_EMUL_TIME 0x57F0 #define EXYNOS_EMUL_TIME_SHIFT 16 #define EXYNOS_EMUL_DATA_SHIFT 8 #define EXYNOS_EMUL_DATA_MASK 0xFF #define EXYNOS_EMUL_ENABLE 0x1 -#endif /* CONFIG_EXYNOS_THERMAL_EMUL */ +#endif /* CONFIG_THERMAL_EMULATION */ /* CPU Zone information */ #define PANIC_ZONE 4 @@ -143,6 +143,7 @@ struct thermal_cooling_conf { struct thermal_sensor_conf { char name[SENSOR_NAME_LEN]; int (*read_temperature)(void *data); + int (*write_emul_temp)(void *drv_data, unsigned long temp); struct thermal_trip_point_conf trip_data; struct thermal_cooling_conf cooling_data; void *private_data; @@ -240,26 +241,6 @@ static int exynos_get_crit_temp(struct thermal_zone_device *thermal, return ret; } -static int exynos_get_frequency_level(unsigned int cpu, unsigned int freq) -{ - int i = 0, ret = -EINVAL; - struct cpufreq_frequency_table *table = NULL; -#ifdef CONFIG_CPU_FREQ - table = cpufreq_frequency_get_table(cpu); -#endif - if (!table) - return ret; - - while (table[i].frequency != CPUFREQ_TABLE_END) { - if (table[i].frequency == CPUFREQ_ENTRY_INVALID) - continue; - if (table[i].frequency == freq) - return i; - i++; - } - return ret; -} - /* Bind callback functions for thermal zone */ static int exynos_bind(struct thermal_zone_device *thermal, struct thermal_cooling_device *cdev) @@ -286,8 +267,8 @@ static int exynos_bind(struct thermal_zone_device *thermal, /* Bind the thermal zone to the cpufreq cooling device */ for (i = 0; i < tab_size; i++) { clip_data = (struct freq_clip_table *)&(tab_ptr[i]); - level = exynos_get_frequency_level(0, clip_data->freq_clip_max); - if (level < 0) + level = cpufreq_cooling_get_level(0, clip_data->freq_clip_max); + if (level == THERMAL_CSTATE_INVALID) return 0; switch (GET_ZONE(i)) { case MONITOR_ZONE: @@ -367,6 +348,23 @@ static int exynos_get_temp(struct thermal_zone_device *thermal, return 0; } +/* Get temperature callback functions for thermal zone */ +static int exynos_set_emul_temp(struct thermal_zone_device *thermal, + unsigned long temp) +{ + void *data; + int ret = -EINVAL; + + if (!th_zone->sensor_conf) { + pr_info("Temperature sensor not initialised\n"); + return -EINVAL; + } + data = th_zone->sensor_conf->private_data; + if (th_zone->sensor_conf->write_emul_temp) + ret = th_zone->sensor_conf->write_emul_temp(data, temp); + return ret; +} + /* Get the temperature trend */ static int exynos_get_trend(struct thermal_zone_device *thermal, int trip, enum thermal_trend *trend) @@ -390,6 +388,7 @@ static struct thermal_zone_device_ops const exynos_dev_ops = { .bind = exynos_bind, .unbind = exynos_unbind, .get_temp = exynos_get_temp, + .set_emul_temp = exynos_set_emul_temp, .get_trend = exynos_get_trend, .get_mode = exynos_get_mode, .set_mode = exynos_set_mode, @@ -712,6 +711,47 @@ static int exynos_tmu_read(struct exynos_tmu_data *data) return temp; } +#ifdef CONFIG_THERMAL_EMULATION +static int exynos_tmu_set_emulation(void *drv_data, unsigned long temp) +{ + struct exynos_tmu_data *data = drv_data; + unsigned int reg; + int ret = -EINVAL; + + if (data->soc == SOC_ARCH_EXYNOS4210) + goto out; + + if (temp && temp < MCELSIUS) + goto out; + + mutex_lock(&data->lock); + clk_enable(data->clk); + + reg = readl(data->base + EXYNOS_EMUL_CON); + + if (temp) { + temp /= MCELSIUS; + + reg = (EXYNOS_EMUL_TIME << EXYNOS_EMUL_TIME_SHIFT) | + (temp_to_code(data, temp) + << EXYNOS_EMUL_DATA_SHIFT) | EXYNOS_EMUL_ENABLE; + } else { + reg &= ~EXYNOS_EMUL_ENABLE; + } + + writel(reg, data->base + EXYNOS_EMUL_CON); + + clk_disable(data->clk); + mutex_unlock(&data->lock); + return 0; +out: + return ret; +} +#else +static int exynos_tmu_set_emulation(void *drv_data, unsigned long temp) + { return -EINVAL; } +#endif/*CONFIG_THERMAL_EMULATION*/ + static void exynos_tmu_work(struct work_struct *work) { struct exynos_tmu_data *data = container_of(work, @@ -745,6 +785,7 @@ static irqreturn_t exynos_tmu_irq(int irq, void *id) static struct thermal_sensor_conf exynos_sensor_conf = { .name = "exynos-therm", .read_temperature = (int (*)(void *))exynos_tmu_read, + .write_emul_temp = exynos_tmu_set_emulation, }; #if defined(CONFIG_CPU_EXYNOS4210) @@ -813,6 +854,10 @@ static const struct of_device_id exynos_tmu_match[] = { .compatible = "samsung,exynos4210-tmu", .data = (void *)EXYNOS4210_TMU_DRV_DATA, }, + { + .compatible = "samsung,exynos4412-tmu", + .data = (void *)EXYNOS_TMU_DRV_DATA, + }, { .compatible = "samsung,exynos5250-tmu", .data = (void *)EXYNOS_TMU_DRV_DATA, @@ -851,93 +896,6 @@ static inline struct exynos_tmu_platform_data *exynos_get_driver_data( platform_get_device_id(pdev)->driver_data; } -#ifdef CONFIG_EXYNOS_THERMAL_EMUL -static ssize_t exynos_tmu_emulation_show(struct device *dev, - struct device_attribute *attr, - char *buf) -{ - struct platform_device *pdev = container_of(dev, - struct platform_device, dev); - struct exynos_tmu_data *data = platform_get_drvdata(pdev); - unsigned int reg; - u8 temp_code; - int temp = 0; - - if (data->soc == SOC_ARCH_EXYNOS4210) - goto out; - - mutex_lock(&data->lock); - clk_enable(data->clk); - reg = readl(data->base + EXYNOS_EMUL_CON); - clk_disable(data->clk); - mutex_unlock(&data->lock); - - if (reg & EXYNOS_EMUL_ENABLE) { - reg >>= EXYNOS_EMUL_DATA_SHIFT; - temp_code = reg & EXYNOS_EMUL_DATA_MASK; - temp = code_to_temp(data, temp_code); - } -out: - return sprintf(buf, "%d\n", temp * MCELSIUS); -} - -static ssize_t exynos_tmu_emulation_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t count) -{ - struct platform_device *pdev = container_of(dev, - struct platform_device, dev); - struct exynos_tmu_data *data = platform_get_drvdata(pdev); - unsigned int reg; - int temp; - - if (data->soc == SOC_ARCH_EXYNOS4210) - goto out; - - if (!sscanf(buf, "%d\n", &temp) || temp < 0) - return -EINVAL; - - mutex_lock(&data->lock); - clk_enable(data->clk); - - reg = readl(data->base + EXYNOS_EMUL_CON); - - if (temp) { - /* Both CELSIUS and MCELSIUS type are available for input */ - if (temp > MCELSIUS) - temp /= MCELSIUS; - - reg = (EXYNOS_EMUL_TIME << EXYNOS_EMUL_TIME_SHIFT) | - (temp_to_code(data, (temp / MCELSIUS)) - << EXYNOS_EMUL_DATA_SHIFT) | EXYNOS_EMUL_ENABLE; - } else { - reg &= ~EXYNOS_EMUL_ENABLE; - } - - writel(reg, data->base + EXYNOS_EMUL_CON); - - clk_disable(data->clk); - mutex_unlock(&data->lock); - -out: - return count; -} - -static DEVICE_ATTR(emulation, 0644, exynos_tmu_emulation_show, - exynos_tmu_emulation_store); -static int create_emulation_sysfs(struct device *dev) -{ - return device_create_file(dev, &dev_attr_emulation); -} -static void remove_emulation_sysfs(struct device *dev) -{ - device_remove_file(dev, &dev_attr_emulation); -} -#else -static inline int create_emulation_sysfs(struct device *dev) { return 0; } -static inline void remove_emulation_sysfs(struct device *dev) {} -#endif - static int exynos_tmu_probe(struct platform_device *pdev) { struct exynos_tmu_data *data; @@ -983,12 +941,16 @@ static int exynos_tmu_probe(struct platform_device *pdev) return ret; } - data->clk = clk_get(NULL, "tmu_apbif"); + data->clk = devm_clk_get(&pdev->dev, "tmu_apbif"); if (IS_ERR(data->clk)) { dev_err(&pdev->dev, "Failed to get clock\n"); return PTR_ERR(data->clk); } + ret = clk_prepare(data->clk); + if (ret) + return ret; + if (pdata->type == SOC_ARCH_EXYNOS || pdata->type == SOC_ARCH_EXYNOS4210) data->soc = pdata->type; @@ -1037,14 +999,10 @@ static int exynos_tmu_probe(struct platform_device *pdev) goto err_clk; } - ret = create_emulation_sysfs(&pdev->dev); - if (ret) - dev_err(&pdev->dev, "Failed to create emulation mode sysfs node\n"); - return 0; err_clk: platform_set_drvdata(pdev, NULL); - clk_put(data->clk); + clk_unprepare(data->clk); return ret; } @@ -1052,13 +1010,11 @@ static int exynos_tmu_remove(struct platform_device *pdev) { struct exynos_tmu_data *data = platform_get_drvdata(pdev); - remove_emulation_sysfs(&pdev->dev); - exynos_tmu_control(pdev, false); exynos_unregister_thermal(); - clk_put(data->clk); + clk_unprepare(data->clk); platform_set_drvdata(pdev, NULL); diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c index 792479f2b64b..944ba2f340c8 100644 --- a/drivers/thermal/fair_share.c +++ b/drivers/thermal/fair_share.c @@ -22,9 +22,6 @@ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include #include #include "thermal_core.h" @@ -111,23 +108,15 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip) static struct thermal_governor thermal_gov_fair_share = { .name = "fair_share", .throttle = fair_share_throttle, - .owner = THIS_MODULE, }; -static int __init thermal_gov_fair_share_init(void) +int thermal_gov_fair_share_register(void) { return thermal_register_governor(&thermal_gov_fair_share); } -static void __exit thermal_gov_fair_share_exit(void) +void thermal_gov_fair_share_unregister(void) { thermal_unregister_governor(&thermal_gov_fair_share); } -/* This should load after thermal framework */ -fs_initcall(thermal_gov_fair_share_init); -module_exit(thermal_gov_fair_share_exit); - -MODULE_AUTHOR("Durgadoss R"); -MODULE_DESCRIPTION("A simple weight based thermal throttling governor"); -MODULE_LICENSE("GPL"); diff --git a/drivers/thermal/kirkwood_thermal.c b/drivers/thermal/kirkwood_thermal.c index e5500edb5285..dfeceaffbc03 100644 --- a/drivers/thermal/kirkwood_thermal.c +++ b/drivers/thermal/kirkwood_thermal.c @@ -41,21 +41,21 @@ static int kirkwood_get_temp(struct thermal_zone_device *thermal, reg = readl_relaxed(priv->sensor); /* Valid check */ - if (!(reg >> KIRKWOOD_THERMAL_VALID_OFFSET) & - KIRKWOOD_THERMAL_VALID_MASK) { + if (!((reg >> KIRKWOOD_THERMAL_VALID_OFFSET) & + KIRKWOOD_THERMAL_VALID_MASK)) { dev_err(&thermal->device, "Temperature sensor reading not valid\n"); return -EIO; } /* - * Calculate temperature. See Section 8.10.1 of the 88AP510, - * datasheet, which has the same sensor. - * Documentation/arm/Marvell/README + * Calculate temperature. According to Marvell internal + * documentation the formula for this is: + * Celsius = (322-reg)/1.3625 */ reg = (reg >> KIRKWOOD_THERMAL_TEMP_OFFSET) & KIRKWOOD_THERMAL_TEMP_MASK; - *temp = ((2281638UL - (7298*reg)) / 10); + *temp = ((3220000000UL - (10000000UL * reg)) / 13625); return 0; } diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c index 2cc5b6115e3e..8d7edd4c8228 100644 --- a/drivers/thermal/rcar_thermal.c +++ b/drivers/thermal/rcar_thermal.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -377,6 +378,9 @@ static int rcar_thermal_probe(struct platform_device *pdev) spin_lock_init(&common->lock); common->dev = dev; + pm_runtime_enable(dev); + pm_runtime_get_sync(dev); + irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0); if (irq) { int ret; @@ -419,12 +423,15 @@ static int rcar_thermal_probe(struct platform_device *pdev) priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); if (!priv) { dev_err(dev, "Could not allocate priv\n"); - return -ENOMEM; + ret = -ENOMEM; + goto error_unregister; } priv->base = devm_ioremap_resource(dev, res); - if (IS_ERR(priv->base)) - return PTR_ERR(priv->base); + if (IS_ERR(priv->base)) { + ret = PTR_ERR(priv->base); + goto error_unregister; + } priv->common = common; priv->id = i; @@ -443,10 +450,10 @@ static int rcar_thermal_probe(struct platform_device *pdev) goto error_unregister; } - list_move_tail(&priv->list, &common->head); - if (rcar_has_irq_support(priv)) rcar_thermal_irq_enable(priv); + + list_move_tail(&priv->list, &common->head); } platform_set_drvdata(pdev, common); @@ -456,8 +463,14 @@ static int rcar_thermal_probe(struct platform_device *pdev) return 0; error_unregister: - rcar_thermal_for_each_priv(priv, common) + rcar_thermal_for_each_priv(priv, common) { thermal_zone_device_unregister(priv->zone); + if (rcar_has_irq_support(priv)) + rcar_thermal_irq_disable(priv); + } + + pm_runtime_put_sync(dev); + pm_runtime_disable(dev); return ret; } @@ -465,13 +478,20 @@ error_unregister: static int rcar_thermal_remove(struct platform_device *pdev) { struct rcar_thermal_common *common = platform_get_drvdata(pdev); + struct device *dev = &pdev->dev; struct rcar_thermal_priv *priv; - rcar_thermal_for_each_priv(priv, common) + rcar_thermal_for_each_priv(priv, common) { thermal_zone_device_unregister(priv->zone); + if (rcar_has_irq_support(priv)) + rcar_thermal_irq_disable(priv); + } platform_set_drvdata(pdev, NULL); + pm_runtime_put_sync(dev); + pm_runtime_disable(dev); + return 0; } diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c index 407cde3211c1..4d4ddae1a991 100644 --- a/drivers/thermal/step_wise.c +++ b/drivers/thermal/step_wise.c @@ -22,9 +22,6 @@ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include #include #include "thermal_core.h" @@ -59,9 +56,12 @@ static unsigned long get_target_state(struct thermal_instance *instance, switch (trend) { case THERMAL_TREND_RAISING: - if (throttle) + if (throttle) { cur_state = cur_state < instance->upper ? (cur_state + 1) : instance->upper; + if (cur_state < instance->lower) + cur_state = instance->lower; + } break; case THERMAL_TREND_RAISE_FULL: if (throttle) @@ -71,8 +71,11 @@ static unsigned long get_target_state(struct thermal_instance *instance, if (cur_state == instance->lower) { if (!throttle) cur_state = -1; - } else + } else { cur_state -= 1; + if (cur_state > instance->upper) + cur_state = instance->upper; + } break; case THERMAL_TREND_DROP_FULL: if (cur_state == instance->lower) { @@ -180,23 +183,14 @@ static int step_wise_throttle(struct thermal_zone_device *tz, int trip) static struct thermal_governor thermal_gov_step_wise = { .name = "step_wise", .throttle = step_wise_throttle, - .owner = THIS_MODULE, }; -static int __init thermal_gov_step_wise_init(void) +int thermal_gov_step_wise_register(void) { return thermal_register_governor(&thermal_gov_step_wise); } -static void __exit thermal_gov_step_wise_exit(void) +void thermal_gov_step_wise_unregister(void) { thermal_unregister_governor(&thermal_gov_step_wise); } - -/* This should load after thermal framework */ -fs_initcall(thermal_gov_step_wise_init); -module_exit(thermal_gov_step_wise_exit); - -MODULE_AUTHOR("Durgadoss R"); -MODULE_DESCRIPTION("A step-by-step thermal throttling governor"); -MODULE_LICENSE("GPL"); diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c new file mode 100644 index 000000000000..d755440791b7 --- /dev/null +++ b/drivers/thermal/thermal_core.c @@ -0,0 +1,2011 @@ +/* + * thermal.c - Generic Thermal Management Sysfs support. + * + * Copyright (C) 2008 Intel Corp + * Copyright (C) 2008 Zhang Rui + * Copyright (C) 2008 Sujith Thomas + * + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; version 2 of the License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + * + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "thermal_core.h" + +MODULE_AUTHOR("Zhang Rui"); +MODULE_DESCRIPTION("Generic thermal management sysfs support"); +MODULE_LICENSE("GPL v2"); + +static DEFINE_IDR(thermal_tz_idr); +static DEFINE_IDR(thermal_cdev_idr); +static DEFINE_MUTEX(thermal_idr_lock); + +static LIST_HEAD(thermal_tz_list); +static LIST_HEAD(thermal_cdev_list); +static LIST_HEAD(thermal_governor_list); + +static DEFINE_MUTEX(thermal_list_lock); +static DEFINE_MUTEX(thermal_governor_lock); + +static struct thermal_governor *__find_governor(const char *name) +{ + struct thermal_governor *pos; + + list_for_each_entry(pos, &thermal_governor_list, governor_list) + if (!strnicmp(name, pos->name, THERMAL_NAME_LENGTH)) + return pos; + + return NULL; +} + +int thermal_register_governor(struct thermal_governor *governor) +{ + int err; + const char *name; + struct thermal_zone_device *pos; + + if (!governor) + return -EINVAL; + + mutex_lock(&thermal_governor_lock); + + err = -EBUSY; + if (__find_governor(governor->name) == NULL) { + err = 0; + list_add(&governor->governor_list, &thermal_governor_list); + } + + mutex_lock(&thermal_list_lock); + + list_for_each_entry(pos, &thermal_tz_list, node) { + if (pos->governor) + continue; + if (pos->tzp) + name = pos->tzp->governor_name; + else + name = DEFAULT_THERMAL_GOVERNOR; + if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) + pos->governor = governor; + } + + mutex_unlock(&thermal_list_lock); + mutex_unlock(&thermal_governor_lock); + + return err; +} + +void thermal_unregister_governor(struct thermal_governor *governor) +{ + struct thermal_zone_device *pos; + + if (!governor) + return; + + mutex_lock(&thermal_governor_lock); + + if (__find_governor(governor->name) == NULL) + goto exit; + + mutex_lock(&thermal_list_lock); + + list_for_each_entry(pos, &thermal_tz_list, node) { + if (!strnicmp(pos->governor->name, governor->name, + THERMAL_NAME_LENGTH)) + pos->governor = NULL; + } + + mutex_unlock(&thermal_list_lock); + list_del(&governor->governor_list); +exit: + mutex_unlock(&thermal_governor_lock); + return; +} + +static int get_idr(struct idr *idr, struct mutex *lock, int *id) +{ + int ret; + + if (lock) + mutex_lock(lock); + ret = idr_alloc(idr, NULL, 0, 0, GFP_KERNEL); + if (lock) + mutex_unlock(lock); + if (unlikely(ret < 0)) + return ret; + *id = ret; + return 0; +} + +static void release_idr(struct idr *idr, struct mutex *lock, int id) +{ + if (lock) + mutex_lock(lock); + idr_remove(idr, id); + if (lock) + mutex_unlock(lock); +} + +int get_tz_trend(struct thermal_zone_device *tz, int trip) +{ + enum thermal_trend trend; + + if (!tz->ops->get_trend || tz->ops->get_trend(tz, trip, &trend)) { + if (tz->temperature > tz->last_temperature) + trend = THERMAL_TREND_RAISING; + else if (tz->temperature < tz->last_temperature) + trend = THERMAL_TREND_DROPPING; + else + trend = THERMAL_TREND_STABLE; + } + + return trend; +} +EXPORT_SYMBOL(get_tz_trend); + +struct thermal_instance *get_thermal_instance(struct thermal_zone_device *tz, + struct thermal_cooling_device *cdev, int trip) +{ + struct thermal_instance *pos = NULL; + struct thermal_instance *target_instance = NULL; + + mutex_lock(&tz->lock); + mutex_lock(&cdev->lock); + + list_for_each_entry(pos, &tz->thermal_instances, tz_node) { + if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { + target_instance = pos; + break; + } + } + + mutex_unlock(&cdev->lock); + mutex_unlock(&tz->lock); + + return target_instance; +} +EXPORT_SYMBOL(get_thermal_instance); + +static void print_bind_err_msg(struct thermal_zone_device *tz, + struct thermal_cooling_device *cdev, int ret) +{ + dev_err(&tz->device, "binding zone %s with cdev %s failed:%d\n", + tz->type, cdev->type, ret); +} + +static void __bind(struct thermal_zone_device *tz, int mask, + struct thermal_cooling_device *cdev) +{ + int i, ret; + + for (i = 0; i < tz->trips; i++) { + if (mask & (1 << i)) { + ret = thermal_zone_bind_cooling_device(tz, i, cdev, + THERMAL_NO_LIMIT, THERMAL_NO_LIMIT); + if (ret) + print_bind_err_msg(tz, cdev, ret); + } + } +} + +static void __unbind(struct thermal_zone_device *tz, int mask, + struct thermal_cooling_device *cdev) +{ + int i; + + for (i = 0; i < tz->trips; i++) + if (mask & (1 << i)) + thermal_zone_unbind_cooling_device(tz, i, cdev); +} + +static void bind_cdev(struct thermal_cooling_device *cdev) +{ + int i, ret; + const struct thermal_zone_params *tzp; + struct thermal_zone_device *pos = NULL; + + mutex_lock(&thermal_list_lock); + + list_for_each_entry(pos, &thermal_tz_list, node) { + if (!pos->tzp && !pos->ops->bind) + continue; + + if (!pos->tzp && pos->ops->bind) { + ret = pos->ops->bind(pos, cdev); + if (ret) + print_bind_err_msg(pos, cdev, ret); + } + + tzp = pos->tzp; + if (!tzp || !tzp->tbp) + continue; + + for (i = 0; i < tzp->num_tbps; i++) { + if (tzp->tbp[i].cdev || !tzp->tbp[i].match) + continue; + if (tzp->tbp[i].match(pos, cdev)) + continue; + tzp->tbp[i].cdev = cdev; + __bind(pos, tzp->tbp[i].trip_mask, cdev); + } + } + + mutex_unlock(&thermal_list_lock); +} + +static void bind_tz(struct thermal_zone_device *tz) +{ + int i, ret; + struct thermal_cooling_device *pos = NULL; + const struct thermal_zone_params *tzp = tz->tzp; + + if (!tzp && !tz->ops->bind) + return; + + mutex_lock(&thermal_list_lock); + + /* If there is no platform data, try to use ops->bind */ + if (!tzp && tz->ops->bind) { + list_for_each_entry(pos, &thermal_cdev_list, node) { + ret = tz->ops->bind(tz, pos); + if (ret) + print_bind_err_msg(tz, pos, ret); + } + goto exit; + } + + if (!tzp || !tzp->tbp) + goto exit; + + list_for_each_entry(pos, &thermal_cdev_list, node) { + for (i = 0; i < tzp->num_tbps; i++) { + if (tzp->tbp[i].cdev || !tzp->tbp[i].match) + continue; + if (tzp->tbp[i].match(tz, pos)) + continue; + tzp->tbp[i].cdev = pos; + __bind(tz, tzp->tbp[i].trip_mask, pos); + } + } +exit: + mutex_unlock(&thermal_list_lock); +} + +static void thermal_zone_device_set_polling(struct thermal_zone_device *tz, + int delay) +{ + if (delay > 1000) + mod_delayed_work(system_freezable_wq, &tz->poll_queue, + round_jiffies(msecs_to_jiffies(delay))); + else if (delay) + mod_delayed_work(system_freezable_wq, &tz->poll_queue, + msecs_to_jiffies(delay)); + else + cancel_delayed_work(&tz->poll_queue); +} + +static void monitor_thermal_zone(struct thermal_zone_device *tz) +{ + mutex_lock(&tz->lock); + + if (tz->passive) + thermal_zone_device_set_polling(tz, tz->passive_delay); + else if (tz->polling_delay) + thermal_zone_device_set_polling(tz, tz->polling_delay); + else + thermal_zone_device_set_polling(tz, 0); + + mutex_unlock(&tz->lock); +} + +static void handle_non_critical_trips(struct thermal_zone_device *tz, + int trip, enum thermal_trip_type trip_type) +{ + if (tz->governor) + tz->governor->throttle(tz, trip); +} + +static void handle_critical_trips(struct thermal_zone_device *tz, + int trip, enum thermal_trip_type trip_type) +{ + long trip_temp; + + tz->ops->get_trip_temp(tz, trip, &trip_temp); + + /* If we have not crossed the trip_temp, we do not care. */ + if (tz->temperature < trip_temp) + return; + + if (tz->ops->notify) + tz->ops->notify(tz, trip, trip_type); + + if (trip_type == THERMAL_TRIP_CRITICAL) { + dev_emerg(&tz->device, + "critical temperature reached(%d C),shutting down\n", + tz->temperature / 1000); + orderly_poweroff(true); + } +} + +static void handle_thermal_trip(struct thermal_zone_device *tz, int trip) +{ + enum thermal_trip_type type; + + tz->ops->get_trip_type(tz, trip, &type); + + if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT) + handle_critical_trips(tz, trip, type); + else + handle_non_critical_trips(tz, trip, type); + /* + * Alright, we handled this trip successfully. + * So, start monitoring again. + */ + monitor_thermal_zone(tz); +} + +/** + * thermal_zone_get_temp() - returns its the temperature of thermal zone + * @tz: a valid pointer to a struct thermal_zone_device + * @temp: a valid pointer to where to store the resulting temperature. + * + * When a valid thermal zone reference is passed, it will fetch its + * temperature and fill @temp. + * + * Return: On success returns 0, an error code otherwise + */ +int thermal_zone_get_temp(struct thermal_zone_device *tz, unsigned long *temp) +{ + int ret = -EINVAL; +#ifdef CONFIG_THERMAL_EMULATION + int count; + unsigned long crit_temp = -1UL; + enum thermal_trip_type type; +#endif + + if (!tz || IS_ERR(tz)) + goto exit; + + mutex_lock(&tz->lock); + + ret = tz->ops->get_temp(tz, temp); +#ifdef CONFIG_THERMAL_EMULATION + if (!tz->emul_temperature) + goto skip_emul; + + for (count = 0; count < tz->trips; count++) { + ret = tz->ops->get_trip_type(tz, count, &type); + if (!ret && type == THERMAL_TRIP_CRITICAL) { + ret = tz->ops->get_trip_temp(tz, count, &crit_temp); + break; + } + } + + if (ret) + goto skip_emul; + + if (*temp < crit_temp) + *temp = tz->emul_temperature; +skip_emul: +#endif + mutex_unlock(&tz->lock); +exit: + return ret; +} +EXPORT_SYMBOL_GPL(thermal_zone_get_temp); + +static void update_temperature(struct thermal_zone_device *tz) +{ + long temp; + int ret; + + ret = thermal_zone_get_temp(tz, &temp); + if (ret) { + dev_warn(&tz->device, "failed to read out thermal zone %d\n", + tz->id); + return; + } + + mutex_lock(&tz->lock); + tz->last_temperature = tz->temperature; + tz->temperature = temp; + mutex_unlock(&tz->lock); +} + +void thermal_zone_device_update(struct thermal_zone_device *tz) +{ + int count; + + update_temperature(tz); + + for (count = 0; count < tz->trips; count++) + handle_thermal_trip(tz, count); +} +EXPORT_SYMBOL_GPL(thermal_zone_device_update); + +static void thermal_zone_device_check(struct work_struct *work) +{ + struct thermal_zone_device *tz = container_of(work, struct + thermal_zone_device, + poll_queue.work); + thermal_zone_device_update(tz); +} + +/* sys I/F for thermal zone */ + +#define to_thermal_zone(_dev) \ + container_of(_dev, struct thermal_zone_device, device) + +static ssize_t +type_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + + return sprintf(buf, "%s\n", tz->type); +} + +static ssize_t +temp_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + long temperature; + int ret; + + ret = thermal_zone_get_temp(tz, &temperature); + + if (ret) + return ret; + + return sprintf(buf, "%ld\n", temperature); +} + +static ssize_t +mode_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + enum thermal_device_mode mode; + int result; + + if (!tz->ops->get_mode) + return -EPERM; + + result = tz->ops->get_mode(tz, &mode); + if (result) + return result; + + return sprintf(buf, "%s\n", mode == THERMAL_DEVICE_ENABLED ? "enabled" + : "disabled"); +} + +static ssize_t +mode_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int result; + + if (!tz->ops->set_mode) + return -EPERM; + + if (!strncmp(buf, "enabled", sizeof("enabled") - 1)) + result = tz->ops->set_mode(tz, THERMAL_DEVICE_ENABLED); + else if (!strncmp(buf, "disabled", sizeof("disabled") - 1)) + result = tz->ops->set_mode(tz, THERMAL_DEVICE_DISABLED); + else + result = -EINVAL; + + if (result) + return result; + + return count; +} + +static ssize_t +trip_point_type_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + enum thermal_trip_type type; + int trip, result; + + if (!tz->ops->get_trip_type) + return -EPERM; + + if (!sscanf(attr->attr.name, "trip_point_%d_type", &trip)) + return -EINVAL; + + result = tz->ops->get_trip_type(tz, trip, &type); + if (result) + return result; + + switch (type) { + case THERMAL_TRIP_CRITICAL: + return sprintf(buf, "critical\n"); + case THERMAL_TRIP_HOT: + return sprintf(buf, "hot\n"); + case THERMAL_TRIP_PASSIVE: + return sprintf(buf, "passive\n"); + case THERMAL_TRIP_ACTIVE: + return sprintf(buf, "active\n"); + default: + return sprintf(buf, "unknown\n"); + } +} + +static ssize_t +trip_point_temp_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int trip, ret; + unsigned long temperature; + + if (!tz->ops->set_trip_temp) + return -EPERM; + + if (!sscanf(attr->attr.name, "trip_point_%d_temp", &trip)) + return -EINVAL; + + if (kstrtoul(buf, 10, &temperature)) + return -EINVAL; + + ret = tz->ops->set_trip_temp(tz, trip, temperature); + + return ret ? ret : count; +} + +static ssize_t +trip_point_temp_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int trip, ret; + long temperature; + + if (!tz->ops->get_trip_temp) + return -EPERM; + + if (!sscanf(attr->attr.name, "trip_point_%d_temp", &trip)) + return -EINVAL; + + ret = tz->ops->get_trip_temp(tz, trip, &temperature); + + if (ret) + return ret; + + return sprintf(buf, "%ld\n", temperature); +} + +static ssize_t +trip_point_hyst_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int trip, ret; + unsigned long temperature; + + if (!tz->ops->set_trip_hyst) + return -EPERM; + + if (!sscanf(attr->attr.name, "trip_point_%d_hyst", &trip)) + return -EINVAL; + + if (kstrtoul(buf, 10, &temperature)) + return -EINVAL; + + /* + * We are not doing any check on the 'temperature' value + * here. The driver implementing 'set_trip_hyst' has to + * take care of this. + */ + ret = tz->ops->set_trip_hyst(tz, trip, temperature); + + return ret ? ret : count; +} + +static ssize_t +trip_point_hyst_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int trip, ret; + unsigned long temperature; + + if (!tz->ops->get_trip_hyst) + return -EPERM; + + if (!sscanf(attr->attr.name, "trip_point_%d_hyst", &trip)) + return -EINVAL; + + ret = tz->ops->get_trip_hyst(tz, trip, &temperature); + + return ret ? ret : sprintf(buf, "%ld\n", temperature); +} + +static ssize_t +passive_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + struct thermal_cooling_device *cdev = NULL; + int state; + + if (!sscanf(buf, "%d\n", &state)) + return -EINVAL; + + /* sanity check: values below 1000 millicelcius don't make sense + * and can cause the system to go into a thermal heart attack + */ + if (state && state < 1000) + return -EINVAL; + + if (state && !tz->forced_passive) { + mutex_lock(&thermal_list_lock); + list_for_each_entry(cdev, &thermal_cdev_list, node) { + if (!strncmp("Processor", cdev->type, + sizeof("Processor"))) + thermal_zone_bind_cooling_device(tz, + THERMAL_TRIPS_NONE, cdev, + THERMAL_NO_LIMIT, + THERMAL_NO_LIMIT); + } + mutex_unlock(&thermal_list_lock); + if (!tz->passive_delay) + tz->passive_delay = 1000; + } else if (!state && tz->forced_passive) { + mutex_lock(&thermal_list_lock); + list_for_each_entry(cdev, &thermal_cdev_list, node) { + if (!strncmp("Processor", cdev->type, + sizeof("Processor"))) + thermal_zone_unbind_cooling_device(tz, + THERMAL_TRIPS_NONE, + cdev); + } + mutex_unlock(&thermal_list_lock); + tz->passive_delay = 0; + } + + tz->forced_passive = state; + + thermal_zone_device_update(tz); + + return count; +} + +static ssize_t +passive_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + + return sprintf(buf, "%d\n", tz->forced_passive); +} + +static ssize_t +policy_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + int ret = -EINVAL; + struct thermal_zone_device *tz = to_thermal_zone(dev); + struct thermal_governor *gov; + + mutex_lock(&thermal_governor_lock); + + gov = __find_governor(buf); + if (!gov) + goto exit; + + tz->governor = gov; + ret = count; + +exit: + mutex_unlock(&thermal_governor_lock); + return ret; +} + +static ssize_t +policy_show(struct device *dev, struct device_attribute *devattr, char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + + return sprintf(buf, "%s\n", tz->governor->name); +} + +#ifdef CONFIG_THERMAL_EMULATION +static ssize_t +emul_temp_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + int ret = 0; + unsigned long temperature; + + if (kstrtoul(buf, 10, &temperature)) + return -EINVAL; + + if (!tz->ops->set_emul_temp) { + mutex_lock(&tz->lock); + tz->emul_temperature = temperature; + mutex_unlock(&tz->lock); + } else { + ret = tz->ops->set_emul_temp(tz, temperature); + } + + return ret ? ret : count; +} +static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store); +#endif/*CONFIG_THERMAL_EMULATION*/ + +static DEVICE_ATTR(type, 0444, type_show, NULL); +static DEVICE_ATTR(temp, 0444, temp_show, NULL); +static DEVICE_ATTR(mode, 0644, mode_show, mode_store); +static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store); +static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store); + +/* sys I/F for cooling device */ +#define to_cooling_device(_dev) \ + container_of(_dev, struct thermal_cooling_device, device) + +static ssize_t +thermal_cooling_device_type_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct thermal_cooling_device *cdev = to_cooling_device(dev); + + return sprintf(buf, "%s\n", cdev->type); +} + +static ssize_t +thermal_cooling_device_max_state_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct thermal_cooling_device *cdev = to_cooling_device(dev); + unsigned long state; + int ret; + + ret = cdev->ops->get_max_state(cdev, &state); + if (ret) + return ret; + return sprintf(buf, "%ld\n", state); +} + +static ssize_t +thermal_cooling_device_cur_state_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct thermal_cooling_device *cdev = to_cooling_device(dev); + unsigned long state; + int ret; + + ret = cdev->ops->get_cur_state(cdev, &state); + if (ret) + return ret; + return sprintf(buf, "%ld\n", state); +} + +static ssize_t +thermal_cooling_device_cur_state_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct thermal_cooling_device *cdev = to_cooling_device(dev); + unsigned long state; + int result; + + if (!sscanf(buf, "%ld\n", &state)) + return -EINVAL; + + if ((long)state < 0) + return -EINVAL; + + result = cdev->ops->set_cur_state(cdev, state); + if (result) + return result; + return count; +} + +static struct device_attribute dev_attr_cdev_type = +__ATTR(type, 0444, thermal_cooling_device_type_show, NULL); +static DEVICE_ATTR(max_state, 0444, + thermal_cooling_device_max_state_show, NULL); +static DEVICE_ATTR(cur_state, 0644, + thermal_cooling_device_cur_state_show, + thermal_cooling_device_cur_state_store); + +static ssize_t +thermal_cooling_device_trip_point_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct thermal_instance *instance; + + instance = + container_of(attr, struct thermal_instance, attr); + + if (instance->trip == THERMAL_TRIPS_NONE) + return sprintf(buf, "-1\n"); + else + return sprintf(buf, "%d\n", instance->trip); +} + +/* Device management */ + +#if defined(CONFIG_THERMAL_HWMON) + +/* hwmon sys I/F */ +#include + +/* thermal zone devices with the same type share one hwmon device */ +struct thermal_hwmon_device { + char type[THERMAL_NAME_LENGTH]; + struct device *device; + int count; + struct list_head tz_list; + struct list_head node; +}; + +struct thermal_hwmon_attr { + struct device_attribute attr; + char name[16]; +}; + +/* one temperature input for each thermal zone */ +struct thermal_hwmon_temp { + struct list_head hwmon_node; + struct thermal_zone_device *tz; + struct thermal_hwmon_attr temp_input; /* hwmon sys attr */ + struct thermal_hwmon_attr temp_crit; /* hwmon sys attr */ +}; + +static LIST_HEAD(thermal_hwmon_list); + +static ssize_t +name_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct thermal_hwmon_device *hwmon = dev_get_drvdata(dev); + return sprintf(buf, "%s\n", hwmon->type); +} +static DEVICE_ATTR(name, 0444, name_show, NULL); + +static ssize_t +temp_input_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + long temperature; + int ret; + struct thermal_hwmon_attr *hwmon_attr + = container_of(attr, struct thermal_hwmon_attr, attr); + struct thermal_hwmon_temp *temp + = container_of(hwmon_attr, struct thermal_hwmon_temp, + temp_input); + struct thermal_zone_device *tz = temp->tz; + + ret = thermal_zone_get_temp(tz, &temperature); + + if (ret) + return ret; + + return sprintf(buf, "%ld\n", temperature); +} + +static ssize_t +temp_crit_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_hwmon_attr *hwmon_attr + = container_of(attr, struct thermal_hwmon_attr, attr); + struct thermal_hwmon_temp *temp + = container_of(hwmon_attr, struct thermal_hwmon_temp, + temp_crit); + struct thermal_zone_device *tz = temp->tz; + long temperature; + int ret; + + ret = tz->ops->get_trip_temp(tz, 0, &temperature); + if (ret) + return ret; + + return sprintf(buf, "%ld\n", temperature); +} + + +static struct thermal_hwmon_device * +thermal_hwmon_lookup_by_type(const struct thermal_zone_device *tz) +{ + struct thermal_hwmon_device *hwmon; + + mutex_lock(&thermal_list_lock); + list_for_each_entry(hwmon, &thermal_hwmon_list, node) + if (!strcmp(hwmon->type, tz->type)) { + mutex_unlock(&thermal_list_lock); + return hwmon; + } + mutex_unlock(&thermal_list_lock); + + return NULL; +} + +/* Find the temperature input matching a given thermal zone */ +static struct thermal_hwmon_temp * +thermal_hwmon_lookup_temp(const struct thermal_hwmon_device *hwmon, + const struct thermal_zone_device *tz) +{ + struct thermal_hwmon_temp *temp; + + mutex_lock(&thermal_list_lock); + list_for_each_entry(temp, &hwmon->tz_list, hwmon_node) + if (temp->tz == tz) { + mutex_unlock(&thermal_list_lock); + return temp; + } + mutex_unlock(&thermal_list_lock); + + return NULL; +} + +static int +thermal_add_hwmon_sysfs(struct thermal_zone_device *tz) +{ + struct thermal_hwmon_device *hwmon; + struct thermal_hwmon_temp *temp; + int new_hwmon_device = 1; + int result; + + hwmon = thermal_hwmon_lookup_by_type(tz); + if (hwmon) { + new_hwmon_device = 0; + goto register_sys_interface; + } + + hwmon = kzalloc(sizeof(struct thermal_hwmon_device), GFP_KERNEL); + if (!hwmon) + return -ENOMEM; + + INIT_LIST_HEAD(&hwmon->tz_list); + strlcpy(hwmon->type, tz->type, THERMAL_NAME_LENGTH); + hwmon->device = hwmon_device_register(NULL); + if (IS_ERR(hwmon->device)) { + result = PTR_ERR(hwmon->device); + goto free_mem; + } + dev_set_drvdata(hwmon->device, hwmon); + result = device_create_file(hwmon->device, &dev_attr_name); + if (result) + goto free_mem; + + register_sys_interface: + temp = kzalloc(sizeof(struct thermal_hwmon_temp), GFP_KERNEL); + if (!temp) { + result = -ENOMEM; + goto unregister_name; + } + + temp->tz = tz; + hwmon->count++; + + snprintf(temp->temp_input.name, sizeof(temp->temp_input.name), + "temp%d_input", hwmon->count); + temp->temp_input.attr.attr.name = temp->temp_input.name; + temp->temp_input.attr.attr.mode = 0444; + temp->temp_input.attr.show = temp_input_show; + sysfs_attr_init(&temp->temp_input.attr.attr); + result = device_create_file(hwmon->device, &temp->temp_input.attr); + if (result) + goto free_temp_mem; + + if (tz->ops->get_crit_temp) { + unsigned long temperature; + if (!tz->ops->get_crit_temp(tz, &temperature)) { + snprintf(temp->temp_crit.name, + sizeof(temp->temp_crit.name), + "temp%d_crit", hwmon->count); + temp->temp_crit.attr.attr.name = temp->temp_crit.name; + temp->temp_crit.attr.attr.mode = 0444; + temp->temp_crit.attr.show = temp_crit_show; + sysfs_attr_init(&temp->temp_crit.attr.attr); + result = device_create_file(hwmon->device, + &temp->temp_crit.attr); + if (result) + goto unregister_input; + } + } + + mutex_lock(&thermal_list_lock); + if (new_hwmon_device) + list_add_tail(&hwmon->node, &thermal_hwmon_list); + list_add_tail(&temp->hwmon_node, &hwmon->tz_list); + mutex_unlock(&thermal_list_lock); + + return 0; + + unregister_input: + device_remove_file(hwmon->device, &temp->temp_input.attr); + free_temp_mem: + kfree(temp); + unregister_name: + if (new_hwmon_device) { + device_remove_file(hwmon->device, &dev_attr_name); + hwmon_device_unregister(hwmon->device); + } + free_mem: + if (new_hwmon_device) + kfree(hwmon); + + return result; +} + +static void +thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz) +{ + struct thermal_hwmon_device *hwmon; + struct thermal_hwmon_temp *temp; + + hwmon = thermal_hwmon_lookup_by_type(tz); + if (unlikely(!hwmon)) { + /* Should never happen... */ + dev_dbg(&tz->device, "hwmon device lookup failed!\n"); + return; + } + + temp = thermal_hwmon_lookup_temp(hwmon, tz); + if (unlikely(!temp)) { + /* Should never happen... */ + dev_dbg(&tz->device, "temperature input lookup failed!\n"); + return; + } + + device_remove_file(hwmon->device, &temp->temp_input.attr); + if (tz->ops->get_crit_temp) + device_remove_file(hwmon->device, &temp->temp_crit.attr); + + mutex_lock(&thermal_list_lock); + list_del(&temp->hwmon_node); + kfree(temp); + if (!list_empty(&hwmon->tz_list)) { + mutex_unlock(&thermal_list_lock); + return; + } + list_del(&hwmon->node); + mutex_unlock(&thermal_list_lock); + + device_remove_file(hwmon->device, &dev_attr_name); + hwmon_device_unregister(hwmon->device); + kfree(hwmon); +} +#else +static int +thermal_add_hwmon_sysfs(struct thermal_zone_device *tz) +{ + return 0; +} + +static void +thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz) +{ +} +#endif + +/** + * thermal_zone_bind_cooling_device() - bind a cooling device to a thermal zone + * @tz: pointer to struct thermal_zone_device + * @trip: indicates which trip point the cooling devices is + * associated with in this thermal zone. + * @cdev: pointer to struct thermal_cooling_device + * @upper: the Maximum cooling state for this trip point. + * THERMAL_NO_LIMIT means no upper limit, + * and the cooling device can be in max_state. + * @lower: the Minimum cooling state can be used for this trip point. + * THERMAL_NO_LIMIT means no lower limit, + * and the cooling device can be in cooling state 0. + * + * This interface function bind a thermal cooling device to the certain trip + * point of a thermal zone device. + * This function is usually called in the thermal zone device .bind callback. + * + * Return: 0 on success, the proper error value otherwise. + */ +int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, + int trip, + struct thermal_cooling_device *cdev, + unsigned long upper, unsigned long lower) +{ + struct thermal_instance *dev; + struct thermal_instance *pos; + struct thermal_zone_device *pos1; + struct thermal_cooling_device *pos2; + unsigned long max_state; + int result; + + if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE)) + return -EINVAL; + + list_for_each_entry(pos1, &thermal_tz_list, node) { + if (pos1 == tz) + break; + } + list_for_each_entry(pos2, &thermal_cdev_list, node) { + if (pos2 == cdev) + break; + } + + if (tz != pos1 || cdev != pos2) + return -EINVAL; + + cdev->ops->get_max_state(cdev, &max_state); + + /* lower default 0, upper default max_state */ + lower = lower == THERMAL_NO_LIMIT ? 0 : lower; + upper = upper == THERMAL_NO_LIMIT ? max_state : upper; + + if (lower > upper || upper > max_state) + return -EINVAL; + + dev = + kzalloc(sizeof(struct thermal_instance), GFP_KERNEL); + if (!dev) + return -ENOMEM; + dev->tz = tz; + dev->cdev = cdev; + dev->trip = trip; + dev->upper = upper; + dev->lower = lower; + dev->target = THERMAL_NO_TARGET; + + result = get_idr(&tz->idr, &tz->lock, &dev->id); + if (result) + goto free_mem; + + sprintf(dev->name, "cdev%d", dev->id); + result = + sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name); + if (result) + goto release_idr; + + sprintf(dev->attr_name, "cdev%d_trip_point", dev->id); + sysfs_attr_init(&dev->attr.attr); + dev->attr.attr.name = dev->attr_name; + dev->attr.attr.mode = 0444; + dev->attr.show = thermal_cooling_device_trip_point_show; + result = device_create_file(&tz->device, &dev->attr); + if (result) + goto remove_symbol_link; + + mutex_lock(&tz->lock); + mutex_lock(&cdev->lock); + list_for_each_entry(pos, &tz->thermal_instances, tz_node) + if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { + result = -EEXIST; + break; + } + if (!result) { + list_add_tail(&dev->tz_node, &tz->thermal_instances); + list_add_tail(&dev->cdev_node, &cdev->thermal_instances); + } + mutex_unlock(&cdev->lock); + mutex_unlock(&tz->lock); + + if (!result) + return 0; + + device_remove_file(&tz->device, &dev->attr); +remove_symbol_link: + sysfs_remove_link(&tz->device.kobj, dev->name); +release_idr: + release_idr(&tz->idr, &tz->lock, dev->id); +free_mem: + kfree(dev); + return result; +} +EXPORT_SYMBOL_GPL(thermal_zone_bind_cooling_device); + +/** + * thermal_zone_unbind_cooling_device() - unbind a cooling device from a + * thermal zone. + * @tz: pointer to a struct thermal_zone_device. + * @trip: indicates which trip point the cooling devices is + * associated with in this thermal zone. + * @cdev: pointer to a struct thermal_cooling_device. + * + * This interface function unbind a thermal cooling device from the certain + * trip point of a thermal zone device. + * This function is usually called in the thermal zone device .unbind callback. + * + * Return: 0 on success, the proper error value otherwise. + */ +int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, + int trip, + struct thermal_cooling_device *cdev) +{ + struct thermal_instance *pos, *next; + + mutex_lock(&tz->lock); + mutex_lock(&cdev->lock); + list_for_each_entry_safe(pos, next, &tz->thermal_instances, tz_node) { + if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { + list_del(&pos->tz_node); + list_del(&pos->cdev_node); + mutex_unlock(&cdev->lock); + mutex_unlock(&tz->lock); + goto unbind; + } + } + mutex_unlock(&cdev->lock); + mutex_unlock(&tz->lock); + + return -ENODEV; + +unbind: + device_remove_file(&tz->device, &pos->attr); + sysfs_remove_link(&tz->device.kobj, pos->name); + release_idr(&tz->idr, &tz->lock, pos->id); + kfree(pos); + return 0; +} +EXPORT_SYMBOL_GPL(thermal_zone_unbind_cooling_device); + +static void thermal_release(struct device *dev) +{ + struct thermal_zone_device *tz; + struct thermal_cooling_device *cdev; + + if (!strncmp(dev_name(dev), "thermal_zone", + sizeof("thermal_zone") - 1)) { + tz = to_thermal_zone(dev); + kfree(tz); + } else { + cdev = to_cooling_device(dev); + kfree(cdev); + } +} + +static struct class thermal_class = { + .name = "thermal", + .dev_release = thermal_release, +}; + +/** + * thermal_cooling_device_register() - register a new thermal cooling device + * @type: the thermal cooling device type. + * @devdata: device private data. + * @ops: standard thermal cooling devices callbacks. + * + * This interface function adds a new thermal cooling device (fan/processor/...) + * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself + * to all the thermal zone devices registered at the same time. + * + * Return: a pointer to the created struct thermal_cooling_device or an + * ERR_PTR. Caller must check return value with IS_ERR*() helpers. + */ +struct thermal_cooling_device * +thermal_cooling_device_register(char *type, void *devdata, + const struct thermal_cooling_device_ops *ops) +{ + struct thermal_cooling_device *cdev; + int result; + + if (type && strlen(type) >= THERMAL_NAME_LENGTH) + return ERR_PTR(-EINVAL); + + if (!ops || !ops->get_max_state || !ops->get_cur_state || + !ops->set_cur_state) + return ERR_PTR(-EINVAL); + + cdev = kzalloc(sizeof(struct thermal_cooling_device), GFP_KERNEL); + if (!cdev) + return ERR_PTR(-ENOMEM); + + result = get_idr(&thermal_cdev_idr, &thermal_idr_lock, &cdev->id); + if (result) { + kfree(cdev); + return ERR_PTR(result); + } + + strlcpy(cdev->type, type ? : "", sizeof(cdev->type)); + mutex_init(&cdev->lock); + INIT_LIST_HEAD(&cdev->thermal_instances); + cdev->ops = ops; + cdev->updated = true; + cdev->device.class = &thermal_class; + cdev->devdata = devdata; + dev_set_name(&cdev->device, "cooling_device%d", cdev->id); + result = device_register(&cdev->device); + if (result) { + release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); + kfree(cdev); + return ERR_PTR(result); + } + + /* sys I/F */ + if (type) { + result = device_create_file(&cdev->device, &dev_attr_cdev_type); + if (result) + goto unregister; + } + + result = device_create_file(&cdev->device, &dev_attr_max_state); + if (result) + goto unregister; + + result = device_create_file(&cdev->device, &dev_attr_cur_state); + if (result) + goto unregister; + + /* Add 'this' new cdev to the global cdev list */ + mutex_lock(&thermal_list_lock); + list_add(&cdev->node, &thermal_cdev_list); + mutex_unlock(&thermal_list_lock); + + /* Update binding information for 'this' new cdev */ + bind_cdev(cdev); + + return cdev; + +unregister: + release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); + device_unregister(&cdev->device); + return ERR_PTR(result); +} +EXPORT_SYMBOL_GPL(thermal_cooling_device_register); + +/** + * thermal_cooling_device_unregister - removes the registered thermal cooling device + * @cdev: the thermal cooling device to remove. + * + * thermal_cooling_device_unregister() must be called when the device is no + * longer needed. + */ +void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) +{ + int i; + const struct thermal_zone_params *tzp; + struct thermal_zone_device *tz; + struct thermal_cooling_device *pos = NULL; + + if (!cdev) + return; + + mutex_lock(&thermal_list_lock); + list_for_each_entry(pos, &thermal_cdev_list, node) + if (pos == cdev) + break; + if (pos != cdev) { + /* thermal cooling device not found */ + mutex_unlock(&thermal_list_lock); + return; + } + list_del(&cdev->node); + + /* Unbind all thermal zones associated with 'this' cdev */ + list_for_each_entry(tz, &thermal_tz_list, node) { + if (tz->ops->unbind) { + tz->ops->unbind(tz, cdev); + continue; + } + + if (!tz->tzp || !tz->tzp->tbp) + continue; + + tzp = tz->tzp; + for (i = 0; i < tzp->num_tbps; i++) { + if (tzp->tbp[i].cdev == cdev) { + __unbind(tz, tzp->tbp[i].trip_mask, cdev); + tzp->tbp[i].cdev = NULL; + } + } + } + + mutex_unlock(&thermal_list_lock); + + if (cdev->type[0]) + device_remove_file(&cdev->device, &dev_attr_cdev_type); + device_remove_file(&cdev->device, &dev_attr_max_state); + device_remove_file(&cdev->device, &dev_attr_cur_state); + + release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); + device_unregister(&cdev->device); + return; +} +EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister); + +void thermal_cdev_update(struct thermal_cooling_device *cdev) +{ + struct thermal_instance *instance; + unsigned long target = 0; + + /* cooling device is updated*/ + if (cdev->updated) + return; + + mutex_lock(&cdev->lock); + /* Make sure cdev enters the deepest cooling state */ + list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) { + if (instance->target == THERMAL_NO_TARGET) + continue; + if (instance->target > target) + target = instance->target; + } + mutex_unlock(&cdev->lock); + cdev->ops->set_cur_state(cdev, target); + cdev->updated = true; +} +EXPORT_SYMBOL(thermal_cdev_update); + +/** + * thermal_notify_framework - Sensor drivers use this API to notify framework + * @tz: thermal zone device + * @trip: indicates which trip point has been crossed + * + * This function handles the trip events from sensor drivers. It starts + * throttling the cooling devices according to the policy configured. + * For CRITICAL and HOT trip points, this notifies the respective drivers, + * and does actual throttling for other trip points i.e ACTIVE and PASSIVE. + * The throttling policy is based on the configured platform data; if no + * platform data is provided, this uses the step_wise throttling policy. + */ +void thermal_notify_framework(struct thermal_zone_device *tz, int trip) +{ + handle_thermal_trip(tz, trip); +} +EXPORT_SYMBOL_GPL(thermal_notify_framework); + +/** + * create_trip_attrs() - create attributes for trip points + * @tz: the thermal zone device + * @mask: Writeable trip point bitmap. + * + * helper function to instantiate sysfs entries for every trip + * point and its properties of a struct thermal_zone_device. + * + * Return: 0 on success, the proper error value otherwise. + */ +static int create_trip_attrs(struct thermal_zone_device *tz, int mask) +{ + int indx; + int size = sizeof(struct thermal_attr) * tz->trips; + + tz->trip_type_attrs = kzalloc(size, GFP_KERNEL); + if (!tz->trip_type_attrs) + return -ENOMEM; + + tz->trip_temp_attrs = kzalloc(size, GFP_KERNEL); + if (!tz->trip_temp_attrs) { + kfree(tz->trip_type_attrs); + return -ENOMEM; + } + + if (tz->ops->get_trip_hyst) { + tz->trip_hyst_attrs = kzalloc(size, GFP_KERNEL); + if (!tz->trip_hyst_attrs) { + kfree(tz->trip_type_attrs); + kfree(tz->trip_temp_attrs); + return -ENOMEM; + } + } + + + for (indx = 0; indx < tz->trips; indx++) { + /* create trip type attribute */ + snprintf(tz->trip_type_attrs[indx].name, THERMAL_NAME_LENGTH, + "trip_point_%d_type", indx); + + sysfs_attr_init(&tz->trip_type_attrs[indx].attr.attr); + tz->trip_type_attrs[indx].attr.attr.name = + tz->trip_type_attrs[indx].name; + tz->trip_type_attrs[indx].attr.attr.mode = S_IRUGO; + tz->trip_type_attrs[indx].attr.show = trip_point_type_show; + + device_create_file(&tz->device, + &tz->trip_type_attrs[indx].attr); + + /* create trip temp attribute */ + snprintf(tz->trip_temp_attrs[indx].name, THERMAL_NAME_LENGTH, + "trip_point_%d_temp", indx); + + sysfs_attr_init(&tz->trip_temp_attrs[indx].attr.attr); + tz->trip_temp_attrs[indx].attr.attr.name = + tz->trip_temp_attrs[indx].name; + tz->trip_temp_attrs[indx].attr.attr.mode = S_IRUGO; + tz->trip_temp_attrs[indx].attr.show = trip_point_temp_show; + if (mask & (1 << indx)) { + tz->trip_temp_attrs[indx].attr.attr.mode |= S_IWUSR; + tz->trip_temp_attrs[indx].attr.store = + trip_point_temp_store; + } + + device_create_file(&tz->device, + &tz->trip_temp_attrs[indx].attr); + + /* create Optional trip hyst attribute */ + if (!tz->ops->get_trip_hyst) + continue; + snprintf(tz->trip_hyst_attrs[indx].name, THERMAL_NAME_LENGTH, + "trip_point_%d_hyst", indx); + + sysfs_attr_init(&tz->trip_hyst_attrs[indx].attr.attr); + tz->trip_hyst_attrs[indx].attr.attr.name = + tz->trip_hyst_attrs[indx].name; + tz->trip_hyst_attrs[indx].attr.attr.mode = S_IRUGO; + tz->trip_hyst_attrs[indx].attr.show = trip_point_hyst_show; + if (tz->ops->set_trip_hyst) { + tz->trip_hyst_attrs[indx].attr.attr.mode |= S_IWUSR; + tz->trip_hyst_attrs[indx].attr.store = + trip_point_hyst_store; + } + + device_create_file(&tz->device, + &tz->trip_hyst_attrs[indx].attr); + } + return 0; +} + +static void remove_trip_attrs(struct thermal_zone_device *tz) +{ + int indx; + + for (indx = 0; indx < tz->trips; indx++) { + device_remove_file(&tz->device, + &tz->trip_type_attrs[indx].attr); + device_remove_file(&tz->device, + &tz->trip_temp_attrs[indx].attr); + if (tz->ops->get_trip_hyst) + device_remove_file(&tz->device, + &tz->trip_hyst_attrs[indx].attr); + } + kfree(tz->trip_type_attrs); + kfree(tz->trip_temp_attrs); + kfree(tz->trip_hyst_attrs); +} + +/** + * thermal_zone_device_register() - register a new thermal zone device + * @type: the thermal zone device type + * @trips: the number of trip points the thermal zone support + * @mask: a bit string indicating the writeablility of trip points + * @devdata: private device data + * @ops: standard thermal zone device callbacks + * @tzp: thermal zone platform parameters + * @passive_delay: number of milliseconds to wait between polls when + * performing passive cooling + * @polling_delay: number of milliseconds to wait between polls when checking + * whether trip points have been crossed (0 for interrupt + * driven systems) + * + * This interface function adds a new thermal zone device (sensor) to + * /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the + * thermal cooling devices registered at the same time. + * thermal_zone_device_unregister() must be called when the device is no + * longer needed. The passive cooling depends on the .get_trend() return value. + * + * Return: a pointer to the created struct thermal_zone_device or an + * in case of error, an ERR_PTR. Caller must check return value with + * IS_ERR*() helpers. + */ +struct thermal_zone_device *thermal_zone_device_register(const char *type, + int trips, int mask, void *devdata, + const struct thermal_zone_device_ops *ops, + const struct thermal_zone_params *tzp, + int passive_delay, int polling_delay) +{ + struct thermal_zone_device *tz; + enum thermal_trip_type trip_type; + int result; + int count; + int passive = 0; + + if (type && strlen(type) >= THERMAL_NAME_LENGTH) + return ERR_PTR(-EINVAL); + + if (trips > THERMAL_MAX_TRIPS || trips < 0 || mask >> trips) + return ERR_PTR(-EINVAL); + + if (!ops || !ops->get_temp) + return ERR_PTR(-EINVAL); + + if (trips > 0 && !ops->get_trip_type) + return ERR_PTR(-EINVAL); + + tz = kzalloc(sizeof(struct thermal_zone_device), GFP_KERNEL); + if (!tz) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&tz->thermal_instances); + idr_init(&tz->idr); + mutex_init(&tz->lock); + result = get_idr(&thermal_tz_idr, &thermal_idr_lock, &tz->id); + if (result) { + kfree(tz); + return ERR_PTR(result); + } + + strlcpy(tz->type, type ? : "", sizeof(tz->type)); + tz->ops = ops; + tz->tzp = tzp; + tz->device.class = &thermal_class; + tz->devdata = devdata; + tz->trips = trips; + tz->passive_delay = passive_delay; + tz->polling_delay = polling_delay; + + dev_set_name(&tz->device, "thermal_zone%d", tz->id); + result = device_register(&tz->device); + if (result) { + release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); + kfree(tz); + return ERR_PTR(result); + } + + /* sys I/F */ + if (type) { + result = device_create_file(&tz->device, &dev_attr_type); + if (result) + goto unregister; + } + + result = device_create_file(&tz->device, &dev_attr_temp); + if (result) + goto unregister; + + if (ops->get_mode) { + result = device_create_file(&tz->device, &dev_attr_mode); + if (result) + goto unregister; + } + + result = create_trip_attrs(tz, mask); + if (result) + goto unregister; + + for (count = 0; count < trips; count++) { + tz->ops->get_trip_type(tz, count, &trip_type); + if (trip_type == THERMAL_TRIP_PASSIVE) + passive = 1; + } + + if (!passive) { + result = device_create_file(&tz->device, &dev_attr_passive); + if (result) + goto unregister; + } + +#ifdef CONFIG_THERMAL_EMULATION + result = device_create_file(&tz->device, &dev_attr_emul_temp); + if (result) + goto unregister; +#endif + /* Create policy attribute */ + result = device_create_file(&tz->device, &dev_attr_policy); + if (result) + goto unregister; + + /* Update 'this' zone's governor information */ + mutex_lock(&thermal_governor_lock); + + if (tz->tzp) + tz->governor = __find_governor(tz->tzp->governor_name); + else + tz->governor = __find_governor(DEFAULT_THERMAL_GOVERNOR); + + mutex_unlock(&thermal_governor_lock); + + result = thermal_add_hwmon_sysfs(tz); + if (result) + goto unregister; + + mutex_lock(&thermal_list_lock); + list_add_tail(&tz->node, &thermal_tz_list); + mutex_unlock(&thermal_list_lock); + + /* Bind cooling devices for this zone */ + bind_tz(tz); + + INIT_DELAYED_WORK(&(tz->poll_queue), thermal_zone_device_check); + + thermal_zone_device_update(tz); + + if (!result) + return tz; + +unregister: + release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); + device_unregister(&tz->device); + return ERR_PTR(result); +} +EXPORT_SYMBOL_GPL(thermal_zone_device_register); + +/** + * thermal_device_unregister - removes the registered thermal zone device + * @tz: the thermal zone device to remove + */ +void thermal_zone_device_unregister(struct thermal_zone_device *tz) +{ + int i; + const struct thermal_zone_params *tzp; + struct thermal_cooling_device *cdev; + struct thermal_zone_device *pos = NULL; + + if (!tz) + return; + + tzp = tz->tzp; + + mutex_lock(&thermal_list_lock); + list_for_each_entry(pos, &thermal_tz_list, node) + if (pos == tz) + break; + if (pos != tz) { + /* thermal zone device not found */ + mutex_unlock(&thermal_list_lock); + return; + } + list_del(&tz->node); + + /* Unbind all cdevs associated with 'this' thermal zone */ + list_for_each_entry(cdev, &thermal_cdev_list, node) { + if (tz->ops->unbind) { + tz->ops->unbind(tz, cdev); + continue; + } + + if (!tzp || !tzp->tbp) + break; + + for (i = 0; i < tzp->num_tbps; i++) { + if (tzp->tbp[i].cdev == cdev) { + __unbind(tz, tzp->tbp[i].trip_mask, cdev); + tzp->tbp[i].cdev = NULL; + } + } + } + + mutex_unlock(&thermal_list_lock); + + thermal_zone_device_set_polling(tz, 0); + + if (tz->type[0]) + device_remove_file(&tz->device, &dev_attr_type); + device_remove_file(&tz->device, &dev_attr_temp); + if (tz->ops->get_mode) + device_remove_file(&tz->device, &dev_attr_mode); + device_remove_file(&tz->device, &dev_attr_policy); + remove_trip_attrs(tz); + tz->governor = NULL; + + thermal_remove_hwmon_sysfs(tz); + release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); + idr_destroy(&tz->idr); + mutex_destroy(&tz->lock); + device_unregister(&tz->device); + return; +} +EXPORT_SYMBOL_GPL(thermal_zone_device_unregister); + +/** + * thermal_zone_get_zone_by_name() - search for a zone and returns its ref + * @name: thermal zone name to fetch the temperature + * + * When only one zone is found with the passed name, returns a reference to it. + * + * Return: On success returns a reference to an unique thermal zone with + * matching name equals to @name, an ERR_PTR otherwise (-EINVAL for invalid + * paramenters, -ENODEV for not found and -EEXIST for multiple matches). + */ +struct thermal_zone_device *thermal_zone_get_zone_by_name(const char *name) +{ + struct thermal_zone_device *pos = NULL, *ref = ERR_PTR(-EINVAL); + unsigned int found = 0; + + if (!name) + goto exit; + + mutex_lock(&thermal_list_lock); + list_for_each_entry(pos, &thermal_tz_list, node) + if (!strnicmp(name, pos->type, THERMAL_NAME_LENGTH)) { + found++; + ref = pos; + } + mutex_unlock(&thermal_list_lock); + + /* nothing has been found, thus an error code for it */ + if (found == 0) + ref = ERR_PTR(-ENODEV); + else if (found > 1) + /* Success only when an unique zone is found */ + ref = ERR_PTR(-EEXIST); + +exit: + return ref; +} +EXPORT_SYMBOL_GPL(thermal_zone_get_zone_by_name); + +#ifdef CONFIG_NET +static struct genl_family thermal_event_genl_family = { + .id = GENL_ID_GENERATE, + .name = THERMAL_GENL_FAMILY_NAME, + .version = THERMAL_GENL_VERSION, + .maxattr = THERMAL_GENL_ATTR_MAX, +}; + +static struct genl_multicast_group thermal_event_mcgrp = { + .name = THERMAL_GENL_MCAST_GROUP_NAME, +}; + +int thermal_generate_netlink_event(struct thermal_zone_device *tz, + enum events event) +{ + struct sk_buff *skb; + struct nlattr *attr; + struct thermal_genl_event *thermal_event; + void *msg_header; + int size; + int result; + static unsigned int thermal_event_seqnum; + + if (!tz) + return -EINVAL; + + /* allocate memory */ + size = nla_total_size(sizeof(struct thermal_genl_event)) + + nla_total_size(0); + + skb = genlmsg_new(size, GFP_ATOMIC); + if (!skb) + return -ENOMEM; + + /* add the genetlink message header */ + msg_header = genlmsg_put(skb, 0, thermal_event_seqnum++, + &thermal_event_genl_family, 0, + THERMAL_GENL_CMD_EVENT); + if (!msg_header) { + nlmsg_free(skb); + return -ENOMEM; + } + + /* fill the data */ + attr = nla_reserve(skb, THERMAL_GENL_ATTR_EVENT, + sizeof(struct thermal_genl_event)); + + if (!attr) { + nlmsg_free(skb); + return -EINVAL; + } + + thermal_event = nla_data(attr); + if (!thermal_event) { + nlmsg_free(skb); + return -EINVAL; + } + + memset(thermal_event, 0, sizeof(struct thermal_genl_event)); + + thermal_event->orig = tz->id; + thermal_event->event = event; + + /* send multicast genetlink message */ + result = genlmsg_end(skb, msg_header); + if (result < 0) { + nlmsg_free(skb); + return result; + } + + result = genlmsg_multicast(skb, 0, thermal_event_mcgrp.id, GFP_ATOMIC); + if (result) + dev_err(&tz->device, "Failed to send netlink event:%d", result); + + return result; +} +EXPORT_SYMBOL_GPL(thermal_generate_netlink_event); + +static int genetlink_init(void) +{ + int result; + + result = genl_register_family(&thermal_event_genl_family); + if (result) + return result; + + result = genl_register_mc_group(&thermal_event_genl_family, + &thermal_event_mcgrp); + if (result) + genl_unregister_family(&thermal_event_genl_family); + return result; +} + +static void genetlink_exit(void) +{ + genl_unregister_family(&thermal_event_genl_family); +} +#else /* !CONFIG_NET */ +static inline int genetlink_init(void) { return 0; } +static inline void genetlink_exit(void) {} +#endif /* !CONFIG_NET */ + +static int __init thermal_register_governors(void) +{ + int result; + + result = thermal_gov_step_wise_register(); + if (result) + return result; + + result = thermal_gov_fair_share_register(); + if (result) + return result; + + return thermal_gov_user_space_register(); +} + +static void thermal_unregister_governors(void) +{ + thermal_gov_step_wise_unregister(); + thermal_gov_fair_share_unregister(); + thermal_gov_user_space_unregister(); +} + +static int __init thermal_init(void) +{ + int result; + + result = thermal_register_governors(); + if (result) + goto error; + + result = class_register(&thermal_class); + if (result) + goto unregister_governors; + + result = genetlink_init(); + if (result) + goto unregister_class; + + return 0; + +unregister_governors: + thermal_unregister_governors(); +unregister_class: + class_unregister(&thermal_class); +error: + idr_destroy(&thermal_tz_idr); + idr_destroy(&thermal_cdev_idr); + mutex_destroy(&thermal_idr_lock); + mutex_destroy(&thermal_list_lock); + mutex_destroy(&thermal_governor_lock); + return result; +} + +static void __exit thermal_exit(void) +{ + genetlink_exit(); + class_unregister(&thermal_class); + thermal_unregister_governors(); + idr_destroy(&thermal_tz_idr); + idr_destroy(&thermal_cdev_idr); + mutex_destroy(&thermal_idr_lock); + mutex_destroy(&thermal_list_lock); + mutex_destroy(&thermal_governor_lock); +} + +fs_initcall(thermal_init); +module_exit(thermal_exit); diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h index 0d3205a18112..7cf2f6626251 100644 --- a/drivers/thermal/thermal_core.h +++ b/drivers/thermal/thermal_core.h @@ -50,4 +50,31 @@ struct thermal_instance { struct list_head cdev_node; /* node in cdev->thermal_instances */ }; +int thermal_register_governor(struct thermal_governor *); +void thermal_unregister_governor(struct thermal_governor *); + +#ifdef CONFIG_THERMAL_GOV_STEP_WISE +int thermal_gov_step_wise_register(void); +void thermal_gov_step_wise_unregister(void); +#else +static inline int thermal_gov_step_wise_register(void) { return 0; } +static inline void thermal_gov_step_wise_unregister(void) {} +#endif /* CONFIG_THERMAL_GOV_STEP_WISE */ + +#ifdef CONFIG_THERMAL_GOV_FAIR_SHARE +int thermal_gov_fair_share_register(void); +void thermal_gov_fair_share_unregister(void); +#else +static inline int thermal_gov_fair_share_register(void) { return 0; } +static inline void thermal_gov_fair_share_unregister(void) {} +#endif /* CONFIG_THERMAL_GOV_FAIR_SHARE */ + +#ifdef CONFIG_THERMAL_GOV_USER_SPACE +int thermal_gov_user_space_register(void); +void thermal_gov_user_space_unregister(void); +#else +static inline int thermal_gov_user_space_register(void) { return 0; } +static inline void thermal_gov_user_space_unregister(void) {} +#endif /* CONFIG_THERMAL_GOV_USER_SPACE */ + #endif /* __THERMAL_CORE_H__ */ diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c deleted file mode 100644 index 5b7863a03f98..000000000000 --- a/drivers/thermal/thermal_sys.c +++ /dev/null @@ -1,1888 +0,0 @@ -/* - * thermal.c - Generic Thermal Management Sysfs support. - * - * Copyright (C) 2008 Intel Corp - * Copyright (C) 2008 Zhang Rui - * Copyright (C) 2008 Sujith Thomas - * - * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; version 2 of the License. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. - * - * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "thermal_core.h" - -MODULE_AUTHOR("Zhang Rui"); -MODULE_DESCRIPTION("Generic thermal management sysfs support"); -MODULE_LICENSE("GPL"); - -static DEFINE_IDR(thermal_tz_idr); -static DEFINE_IDR(thermal_cdev_idr); -static DEFINE_MUTEX(thermal_idr_lock); - -static LIST_HEAD(thermal_tz_list); -static LIST_HEAD(thermal_cdev_list); -static LIST_HEAD(thermal_governor_list); - -static DEFINE_MUTEX(thermal_list_lock); -static DEFINE_MUTEX(thermal_governor_lock); - -static struct thermal_governor *__find_governor(const char *name) -{ - struct thermal_governor *pos; - - list_for_each_entry(pos, &thermal_governor_list, governor_list) - if (!strnicmp(name, pos->name, THERMAL_NAME_LENGTH)) - return pos; - - return NULL; -} - -int thermal_register_governor(struct thermal_governor *governor) -{ - int err; - const char *name; - struct thermal_zone_device *pos; - - if (!governor) - return -EINVAL; - - mutex_lock(&thermal_governor_lock); - - err = -EBUSY; - if (__find_governor(governor->name) == NULL) { - err = 0; - list_add(&governor->governor_list, &thermal_governor_list); - } - - mutex_lock(&thermal_list_lock); - - list_for_each_entry(pos, &thermal_tz_list, node) { - if (pos->governor) - continue; - if (pos->tzp) - name = pos->tzp->governor_name; - else - name = DEFAULT_THERMAL_GOVERNOR; - if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) - pos->governor = governor; - } - - mutex_unlock(&thermal_list_lock); - mutex_unlock(&thermal_governor_lock); - - return err; -} -EXPORT_SYMBOL_GPL(thermal_register_governor); - -void thermal_unregister_governor(struct thermal_governor *governor) -{ - struct thermal_zone_device *pos; - - if (!governor) - return; - - mutex_lock(&thermal_governor_lock); - - if (__find_governor(governor->name) == NULL) - goto exit; - - mutex_lock(&thermal_list_lock); - - list_for_each_entry(pos, &thermal_tz_list, node) { - if (!strnicmp(pos->governor->name, governor->name, - THERMAL_NAME_LENGTH)) - pos->governor = NULL; - } - - mutex_unlock(&thermal_list_lock); - list_del(&governor->governor_list); -exit: - mutex_unlock(&thermal_governor_lock); - return; -} -EXPORT_SYMBOL_GPL(thermal_unregister_governor); - -static int get_idr(struct idr *idr, struct mutex *lock, int *id) -{ - int ret; - - if (lock) - mutex_lock(lock); - ret = idr_alloc(idr, NULL, 0, 0, GFP_KERNEL); - if (lock) - mutex_unlock(lock); - if (unlikely(ret < 0)) - return ret; - *id = ret; - return 0; -} - -static void release_idr(struct idr *idr, struct mutex *lock, int id) -{ - if (lock) - mutex_lock(lock); - idr_remove(idr, id); - if (lock) - mutex_unlock(lock); -} - -int get_tz_trend(struct thermal_zone_device *tz, int trip) -{ - enum thermal_trend trend; - - if (!tz->ops->get_trend || tz->ops->get_trend(tz, trip, &trend)) { - if (tz->temperature > tz->last_temperature) - trend = THERMAL_TREND_RAISING; - else if (tz->temperature < tz->last_temperature) - trend = THERMAL_TREND_DROPPING; - else - trend = THERMAL_TREND_STABLE; - } - - return trend; -} -EXPORT_SYMBOL(get_tz_trend); - -struct thermal_instance *get_thermal_instance(struct thermal_zone_device *tz, - struct thermal_cooling_device *cdev, int trip) -{ - struct thermal_instance *pos = NULL; - struct thermal_instance *target_instance = NULL; - - mutex_lock(&tz->lock); - mutex_lock(&cdev->lock); - - list_for_each_entry(pos, &tz->thermal_instances, tz_node) { - if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { - target_instance = pos; - break; - } - } - - mutex_unlock(&cdev->lock); - mutex_unlock(&tz->lock); - - return target_instance; -} -EXPORT_SYMBOL(get_thermal_instance); - -static void print_bind_err_msg(struct thermal_zone_device *tz, - struct thermal_cooling_device *cdev, int ret) -{ - dev_err(&tz->device, "binding zone %s with cdev %s failed:%d\n", - tz->type, cdev->type, ret); -} - -static void __bind(struct thermal_zone_device *tz, int mask, - struct thermal_cooling_device *cdev) -{ - int i, ret; - - for (i = 0; i < tz->trips; i++) { - if (mask & (1 << i)) { - ret = thermal_zone_bind_cooling_device(tz, i, cdev, - THERMAL_NO_LIMIT, THERMAL_NO_LIMIT); - if (ret) - print_bind_err_msg(tz, cdev, ret); - } - } -} - -static void __unbind(struct thermal_zone_device *tz, int mask, - struct thermal_cooling_device *cdev) -{ - int i; - - for (i = 0; i < tz->trips; i++) - if (mask & (1 << i)) - thermal_zone_unbind_cooling_device(tz, i, cdev); -} - -static void bind_cdev(struct thermal_cooling_device *cdev) -{ - int i, ret; - const struct thermal_zone_params *tzp; - struct thermal_zone_device *pos = NULL; - - mutex_lock(&thermal_list_lock); - - list_for_each_entry(pos, &thermal_tz_list, node) { - if (!pos->tzp && !pos->ops->bind) - continue; - - if (!pos->tzp && pos->ops->bind) { - ret = pos->ops->bind(pos, cdev); - if (ret) - print_bind_err_msg(pos, cdev, ret); - } - - tzp = pos->tzp; - if (!tzp || !tzp->tbp) - continue; - - for (i = 0; i < tzp->num_tbps; i++) { - if (tzp->tbp[i].cdev || !tzp->tbp[i].match) - continue; - if (tzp->tbp[i].match(pos, cdev)) - continue; - tzp->tbp[i].cdev = cdev; - __bind(pos, tzp->tbp[i].trip_mask, cdev); - } - } - - mutex_unlock(&thermal_list_lock); -} - -static void bind_tz(struct thermal_zone_device *tz) -{ - int i, ret; - struct thermal_cooling_device *pos = NULL; - const struct thermal_zone_params *tzp = tz->tzp; - - if (!tzp && !tz->ops->bind) - return; - - mutex_lock(&thermal_list_lock); - - /* If there is no platform data, try to use ops->bind */ - if (!tzp && tz->ops->bind) { - list_for_each_entry(pos, &thermal_cdev_list, node) { - ret = tz->ops->bind(tz, pos); - if (ret) - print_bind_err_msg(tz, pos, ret); - } - goto exit; - } - - if (!tzp || !tzp->tbp) - goto exit; - - list_for_each_entry(pos, &thermal_cdev_list, node) { - for (i = 0; i < tzp->num_tbps; i++) { - if (tzp->tbp[i].cdev || !tzp->tbp[i].match) - continue; - if (tzp->tbp[i].match(tz, pos)) - continue; - tzp->tbp[i].cdev = pos; - __bind(tz, tzp->tbp[i].trip_mask, pos); - } - } -exit: - mutex_unlock(&thermal_list_lock); -} - -static void thermal_zone_device_set_polling(struct thermal_zone_device *tz, - int delay) -{ - if (delay > 1000) - mod_delayed_work(system_freezable_wq, &tz->poll_queue, - round_jiffies(msecs_to_jiffies(delay))); - else if (delay) - mod_delayed_work(system_freezable_wq, &tz->poll_queue, - msecs_to_jiffies(delay)); - else - cancel_delayed_work(&tz->poll_queue); -} - -static void monitor_thermal_zone(struct thermal_zone_device *tz) -{ - mutex_lock(&tz->lock); - - if (tz->passive) - thermal_zone_device_set_polling(tz, tz->passive_delay); - else if (tz->polling_delay) - thermal_zone_device_set_polling(tz, tz->polling_delay); - else - thermal_zone_device_set_polling(tz, 0); - - mutex_unlock(&tz->lock); -} - -static void handle_non_critical_trips(struct thermal_zone_device *tz, - int trip, enum thermal_trip_type trip_type) -{ - if (tz->governor) - tz->governor->throttle(tz, trip); -} - -static void handle_critical_trips(struct thermal_zone_device *tz, - int trip, enum thermal_trip_type trip_type) -{ - long trip_temp; - - tz->ops->get_trip_temp(tz, trip, &trip_temp); - - /* If we have not crossed the trip_temp, we do not care. */ - if (tz->temperature < trip_temp) - return; - - if (tz->ops->notify) - tz->ops->notify(tz, trip, trip_type); - - if (trip_type == THERMAL_TRIP_CRITICAL) { - dev_emerg(&tz->device, - "critical temperature reached(%d C),shutting down\n", - tz->temperature / 1000); - orderly_poweroff(true); - } -} - -static void handle_thermal_trip(struct thermal_zone_device *tz, int trip) -{ - enum thermal_trip_type type; - - tz->ops->get_trip_type(tz, trip, &type); - - if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT) - handle_critical_trips(tz, trip, type); - else - handle_non_critical_trips(tz, trip, type); - /* - * Alright, we handled this trip successfully. - * So, start monitoring again. - */ - monitor_thermal_zone(tz); -} - -static int thermal_zone_get_temp(struct thermal_zone_device *tz, - unsigned long *temp) -{ - int ret = 0; -#ifdef CONFIG_THERMAL_EMULATION - int count; - unsigned long crit_temp = -1UL; - enum thermal_trip_type type; -#endif - - mutex_lock(&tz->lock); - - ret = tz->ops->get_temp(tz, temp); -#ifdef CONFIG_THERMAL_EMULATION - if (!tz->emul_temperature) - goto skip_emul; - - for (count = 0; count < tz->trips; count++) { - ret = tz->ops->get_trip_type(tz, count, &type); - if (!ret && type == THERMAL_TRIP_CRITICAL) { - ret = tz->ops->get_trip_temp(tz, count, &crit_temp); - break; - } - } - - if (ret) - goto skip_emul; - - if (*temp < crit_temp) - *temp = tz->emul_temperature; -skip_emul: -#endif - mutex_unlock(&tz->lock); - return ret; -} - -static void update_temperature(struct thermal_zone_device *tz) -{ - long temp; - int ret; - - ret = thermal_zone_get_temp(tz, &temp); - if (ret) { - dev_warn(&tz->device, "failed to read out thermal zone %d\n", - tz->id); - return; - } - - mutex_lock(&tz->lock); - tz->last_temperature = tz->temperature; - tz->temperature = temp; - mutex_unlock(&tz->lock); -} - -void thermal_zone_device_update(struct thermal_zone_device *tz) -{ - int count; - - update_temperature(tz); - - for (count = 0; count < tz->trips; count++) - handle_thermal_trip(tz, count); -} -EXPORT_SYMBOL(thermal_zone_device_update); - -static void thermal_zone_device_check(struct work_struct *work) -{ - struct thermal_zone_device *tz = container_of(work, struct - thermal_zone_device, - poll_queue.work); - thermal_zone_device_update(tz); -} - -/* sys I/F for thermal zone */ - -#define to_thermal_zone(_dev) \ - container_of(_dev, struct thermal_zone_device, device) - -static ssize_t -type_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - - return sprintf(buf, "%s\n", tz->type); -} - -static ssize_t -temp_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - long temperature; - int ret; - - ret = thermal_zone_get_temp(tz, &temperature); - - if (ret) - return ret; - - return sprintf(buf, "%ld\n", temperature); -} - -static ssize_t -mode_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - enum thermal_device_mode mode; - int result; - - if (!tz->ops->get_mode) - return -EPERM; - - result = tz->ops->get_mode(tz, &mode); - if (result) - return result; - - return sprintf(buf, "%s\n", mode == THERMAL_DEVICE_ENABLED ? "enabled" - : "disabled"); -} - -static ssize_t -mode_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int result; - - if (!tz->ops->set_mode) - return -EPERM; - - if (!strncmp(buf, "enabled", sizeof("enabled") - 1)) - result = tz->ops->set_mode(tz, THERMAL_DEVICE_ENABLED); - else if (!strncmp(buf, "disabled", sizeof("disabled") - 1)) - result = tz->ops->set_mode(tz, THERMAL_DEVICE_DISABLED); - else - result = -EINVAL; - - if (result) - return result; - - return count; -} - -static ssize_t -trip_point_type_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - enum thermal_trip_type type; - int trip, result; - - if (!tz->ops->get_trip_type) - return -EPERM; - - if (!sscanf(attr->attr.name, "trip_point_%d_type", &trip)) - return -EINVAL; - - result = tz->ops->get_trip_type(tz, trip, &type); - if (result) - return result; - - switch (type) { - case THERMAL_TRIP_CRITICAL: - return sprintf(buf, "critical\n"); - case THERMAL_TRIP_HOT: - return sprintf(buf, "hot\n"); - case THERMAL_TRIP_PASSIVE: - return sprintf(buf, "passive\n"); - case THERMAL_TRIP_ACTIVE: - return sprintf(buf, "active\n"); - default: - return sprintf(buf, "unknown\n"); - } -} - -static ssize_t -trip_point_temp_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int trip, ret; - unsigned long temperature; - - if (!tz->ops->set_trip_temp) - return -EPERM; - - if (!sscanf(attr->attr.name, "trip_point_%d_temp", &trip)) - return -EINVAL; - - if (kstrtoul(buf, 10, &temperature)) - return -EINVAL; - - ret = tz->ops->set_trip_temp(tz, trip, temperature); - - return ret ? ret : count; -} - -static ssize_t -trip_point_temp_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int trip, ret; - long temperature; - - if (!tz->ops->get_trip_temp) - return -EPERM; - - if (!sscanf(attr->attr.name, "trip_point_%d_temp", &trip)) - return -EINVAL; - - ret = tz->ops->get_trip_temp(tz, trip, &temperature); - - if (ret) - return ret; - - return sprintf(buf, "%ld\n", temperature); -} - -static ssize_t -trip_point_hyst_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int trip, ret; - unsigned long temperature; - - if (!tz->ops->set_trip_hyst) - return -EPERM; - - if (!sscanf(attr->attr.name, "trip_point_%d_hyst", &trip)) - return -EINVAL; - - if (kstrtoul(buf, 10, &temperature)) - return -EINVAL; - - /* - * We are not doing any check on the 'temperature' value - * here. The driver implementing 'set_trip_hyst' has to - * take care of this. - */ - ret = tz->ops->set_trip_hyst(tz, trip, temperature); - - return ret ? ret : count; -} - -static ssize_t -trip_point_hyst_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int trip, ret; - unsigned long temperature; - - if (!tz->ops->get_trip_hyst) - return -EPERM; - - if (!sscanf(attr->attr.name, "trip_point_%d_hyst", &trip)) - return -EINVAL; - - ret = tz->ops->get_trip_hyst(tz, trip, &temperature); - - return ret ? ret : sprintf(buf, "%ld\n", temperature); -} - -static ssize_t -passive_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - struct thermal_cooling_device *cdev = NULL; - int state; - - if (!sscanf(buf, "%d\n", &state)) - return -EINVAL; - - /* sanity check: values below 1000 millicelcius don't make sense - * and can cause the system to go into a thermal heart attack - */ - if (state && state < 1000) - return -EINVAL; - - if (state && !tz->forced_passive) { - mutex_lock(&thermal_list_lock); - list_for_each_entry(cdev, &thermal_cdev_list, node) { - if (!strncmp("Processor", cdev->type, - sizeof("Processor"))) - thermal_zone_bind_cooling_device(tz, - THERMAL_TRIPS_NONE, cdev, - THERMAL_NO_LIMIT, - THERMAL_NO_LIMIT); - } - mutex_unlock(&thermal_list_lock); - if (!tz->passive_delay) - tz->passive_delay = 1000; - } else if (!state && tz->forced_passive) { - mutex_lock(&thermal_list_lock); - list_for_each_entry(cdev, &thermal_cdev_list, node) { - if (!strncmp("Processor", cdev->type, - sizeof("Processor"))) - thermal_zone_unbind_cooling_device(tz, - THERMAL_TRIPS_NONE, - cdev); - } - mutex_unlock(&thermal_list_lock); - tz->passive_delay = 0; - } - - tz->forced_passive = state; - - thermal_zone_device_update(tz); - - return count; -} - -static ssize_t -passive_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - - return sprintf(buf, "%d\n", tz->forced_passive); -} - -static ssize_t -policy_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - int ret = -EINVAL; - struct thermal_zone_device *tz = to_thermal_zone(dev); - struct thermal_governor *gov; - - mutex_lock(&thermal_governor_lock); - - gov = __find_governor(buf); - if (!gov) - goto exit; - - tz->governor = gov; - ret = count; - -exit: - mutex_unlock(&thermal_governor_lock); - return ret; -} - -static ssize_t -policy_show(struct device *dev, struct device_attribute *devattr, char *buf) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - - return sprintf(buf, "%s\n", tz->governor->name); -} - -#ifdef CONFIG_THERMAL_EMULATION -static ssize_t -emul_temp_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_zone_device *tz = to_thermal_zone(dev); - int ret = 0; - unsigned long temperature; - - if (kstrtoul(buf, 10, &temperature)) - return -EINVAL; - - if (!tz->ops->set_emul_temp) { - mutex_lock(&tz->lock); - tz->emul_temperature = temperature; - mutex_unlock(&tz->lock); - } else { - ret = tz->ops->set_emul_temp(tz, temperature); - } - - return ret ? ret : count; -} -static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store); -#endif/*CONFIG_THERMAL_EMULATION*/ - -static DEVICE_ATTR(type, 0444, type_show, NULL); -static DEVICE_ATTR(temp, 0444, temp_show, NULL); -static DEVICE_ATTR(mode, 0644, mode_show, mode_store); -static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store); -static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store); - -/* sys I/F for cooling device */ -#define to_cooling_device(_dev) \ - container_of(_dev, struct thermal_cooling_device, device) - -static ssize_t -thermal_cooling_device_type_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct thermal_cooling_device *cdev = to_cooling_device(dev); - - return sprintf(buf, "%s\n", cdev->type); -} - -static ssize_t -thermal_cooling_device_max_state_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct thermal_cooling_device *cdev = to_cooling_device(dev); - unsigned long state; - int ret; - - ret = cdev->ops->get_max_state(cdev, &state); - if (ret) - return ret; - return sprintf(buf, "%ld\n", state); -} - -static ssize_t -thermal_cooling_device_cur_state_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct thermal_cooling_device *cdev = to_cooling_device(dev); - unsigned long state; - int ret; - - ret = cdev->ops->get_cur_state(cdev, &state); - if (ret) - return ret; - return sprintf(buf, "%ld\n", state); -} - -static ssize_t -thermal_cooling_device_cur_state_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t count) -{ - struct thermal_cooling_device *cdev = to_cooling_device(dev); - unsigned long state; - int result; - - if (!sscanf(buf, "%ld\n", &state)) - return -EINVAL; - - if ((long)state < 0) - return -EINVAL; - - result = cdev->ops->set_cur_state(cdev, state); - if (result) - return result; - return count; -} - -static struct device_attribute dev_attr_cdev_type = -__ATTR(type, 0444, thermal_cooling_device_type_show, NULL); -static DEVICE_ATTR(max_state, 0444, - thermal_cooling_device_max_state_show, NULL); -static DEVICE_ATTR(cur_state, 0644, - thermal_cooling_device_cur_state_show, - thermal_cooling_device_cur_state_store); - -static ssize_t -thermal_cooling_device_trip_point_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct thermal_instance *instance; - - instance = - container_of(attr, struct thermal_instance, attr); - - if (instance->trip == THERMAL_TRIPS_NONE) - return sprintf(buf, "-1\n"); - else - return sprintf(buf, "%d\n", instance->trip); -} - -/* Device management */ - -#if defined(CONFIG_THERMAL_HWMON) - -/* hwmon sys I/F */ -#include - -/* thermal zone devices with the same type share one hwmon device */ -struct thermal_hwmon_device { - char type[THERMAL_NAME_LENGTH]; - struct device *device; - int count; - struct list_head tz_list; - struct list_head node; -}; - -struct thermal_hwmon_attr { - struct device_attribute attr; - char name[16]; -}; - -/* one temperature input for each thermal zone */ -struct thermal_hwmon_temp { - struct list_head hwmon_node; - struct thermal_zone_device *tz; - struct thermal_hwmon_attr temp_input; /* hwmon sys attr */ - struct thermal_hwmon_attr temp_crit; /* hwmon sys attr */ -}; - -static LIST_HEAD(thermal_hwmon_list); - -static ssize_t -name_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct thermal_hwmon_device *hwmon = dev_get_drvdata(dev); - return sprintf(buf, "%s\n", hwmon->type); -} -static DEVICE_ATTR(name, 0444, name_show, NULL); - -static ssize_t -temp_input_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - long temperature; - int ret; - struct thermal_hwmon_attr *hwmon_attr - = container_of(attr, struct thermal_hwmon_attr, attr); - struct thermal_hwmon_temp *temp - = container_of(hwmon_attr, struct thermal_hwmon_temp, - temp_input); - struct thermal_zone_device *tz = temp->tz; - - ret = thermal_zone_get_temp(tz, &temperature); - - if (ret) - return ret; - - return sprintf(buf, "%ld\n", temperature); -} - -static ssize_t -temp_crit_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct thermal_hwmon_attr *hwmon_attr - = container_of(attr, struct thermal_hwmon_attr, attr); - struct thermal_hwmon_temp *temp - = container_of(hwmon_attr, struct thermal_hwmon_temp, - temp_crit); - struct thermal_zone_device *tz = temp->tz; - long temperature; - int ret; - - ret = tz->ops->get_trip_temp(tz, 0, &temperature); - if (ret) - return ret; - - return sprintf(buf, "%ld\n", temperature); -} - - -static struct thermal_hwmon_device * -thermal_hwmon_lookup_by_type(const struct thermal_zone_device *tz) -{ - struct thermal_hwmon_device *hwmon; - - mutex_lock(&thermal_list_lock); - list_for_each_entry(hwmon, &thermal_hwmon_list, node) - if (!strcmp(hwmon->type, tz->type)) { - mutex_unlock(&thermal_list_lock); - return hwmon; - } - mutex_unlock(&thermal_list_lock); - - return NULL; -} - -/* Find the temperature input matching a given thermal zone */ -static struct thermal_hwmon_temp * -thermal_hwmon_lookup_temp(const struct thermal_hwmon_device *hwmon, - const struct thermal_zone_device *tz) -{ - struct thermal_hwmon_temp *temp; - - mutex_lock(&thermal_list_lock); - list_for_each_entry(temp, &hwmon->tz_list, hwmon_node) - if (temp->tz == tz) { - mutex_unlock(&thermal_list_lock); - return temp; - } - mutex_unlock(&thermal_list_lock); - - return NULL; -} - -static int -thermal_add_hwmon_sysfs(struct thermal_zone_device *tz) -{ - struct thermal_hwmon_device *hwmon; - struct thermal_hwmon_temp *temp; - int new_hwmon_device = 1; - int result; - - hwmon = thermal_hwmon_lookup_by_type(tz); - if (hwmon) { - new_hwmon_device = 0; - goto register_sys_interface; - } - - hwmon = kzalloc(sizeof(struct thermal_hwmon_device), GFP_KERNEL); - if (!hwmon) - return -ENOMEM; - - INIT_LIST_HEAD(&hwmon->tz_list); - strlcpy(hwmon->type, tz->type, THERMAL_NAME_LENGTH); - hwmon->device = hwmon_device_register(NULL); - if (IS_ERR(hwmon->device)) { - result = PTR_ERR(hwmon->device); - goto free_mem; - } - dev_set_drvdata(hwmon->device, hwmon); - result = device_create_file(hwmon->device, &dev_attr_name); - if (result) - goto free_mem; - - register_sys_interface: - temp = kzalloc(sizeof(struct thermal_hwmon_temp), GFP_KERNEL); - if (!temp) { - result = -ENOMEM; - goto unregister_name; - } - - temp->tz = tz; - hwmon->count++; - - snprintf(temp->temp_input.name, sizeof(temp->temp_input.name), - "temp%d_input", hwmon->count); - temp->temp_input.attr.attr.name = temp->temp_input.name; - temp->temp_input.attr.attr.mode = 0444; - temp->temp_input.attr.show = temp_input_show; - sysfs_attr_init(&temp->temp_input.attr.attr); - result = device_create_file(hwmon->device, &temp->temp_input.attr); - if (result) - goto free_temp_mem; - - if (tz->ops->get_crit_temp) { - unsigned long temperature; - if (!tz->ops->get_crit_temp(tz, &temperature)) { - snprintf(temp->temp_crit.name, - sizeof(temp->temp_crit.name), - "temp%d_crit", hwmon->count); - temp->temp_crit.attr.attr.name = temp->temp_crit.name; - temp->temp_crit.attr.attr.mode = 0444; - temp->temp_crit.attr.show = temp_crit_show; - sysfs_attr_init(&temp->temp_crit.attr.attr); - result = device_create_file(hwmon->device, - &temp->temp_crit.attr); - if (result) - goto unregister_input; - } - } - - mutex_lock(&thermal_list_lock); - if (new_hwmon_device) - list_add_tail(&hwmon->node, &thermal_hwmon_list); - list_add_tail(&temp->hwmon_node, &hwmon->tz_list); - mutex_unlock(&thermal_list_lock); - - return 0; - - unregister_input: - device_remove_file(hwmon->device, &temp->temp_input.attr); - free_temp_mem: - kfree(temp); - unregister_name: - if (new_hwmon_device) { - device_remove_file(hwmon->device, &dev_attr_name); - hwmon_device_unregister(hwmon->device); - } - free_mem: - if (new_hwmon_device) - kfree(hwmon); - - return result; -} - -static void -thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz) -{ - struct thermal_hwmon_device *hwmon; - struct thermal_hwmon_temp *temp; - - hwmon = thermal_hwmon_lookup_by_type(tz); - if (unlikely(!hwmon)) { - /* Should never happen... */ - dev_dbg(&tz->device, "hwmon device lookup failed!\n"); - return; - } - - temp = thermal_hwmon_lookup_temp(hwmon, tz); - if (unlikely(!temp)) { - /* Should never happen... */ - dev_dbg(&tz->device, "temperature input lookup failed!\n"); - return; - } - - device_remove_file(hwmon->device, &temp->temp_input.attr); - if (tz->ops->get_crit_temp) - device_remove_file(hwmon->device, &temp->temp_crit.attr); - - mutex_lock(&thermal_list_lock); - list_del(&temp->hwmon_node); - kfree(temp); - if (!list_empty(&hwmon->tz_list)) { - mutex_unlock(&thermal_list_lock); - return; - } - list_del(&hwmon->node); - mutex_unlock(&thermal_list_lock); - - device_remove_file(hwmon->device, &dev_attr_name); - hwmon_device_unregister(hwmon->device); - kfree(hwmon); -} -#else -static int -thermal_add_hwmon_sysfs(struct thermal_zone_device *tz) -{ - return 0; -} - -static void -thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz) -{ -} -#endif - -/** - * thermal_zone_bind_cooling_device - bind a cooling device to a thermal zone - * @tz: thermal zone device - * @trip: indicates which trip point the cooling devices is - * associated with in this thermal zone. - * @cdev: thermal cooling device - * - * This function is usually called in the thermal zone device .bind callback. - */ -int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, - int trip, - struct thermal_cooling_device *cdev, - unsigned long upper, unsigned long lower) -{ - struct thermal_instance *dev; - struct thermal_instance *pos; - struct thermal_zone_device *pos1; - struct thermal_cooling_device *pos2; - unsigned long max_state; - int result; - - if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE)) - return -EINVAL; - - list_for_each_entry(pos1, &thermal_tz_list, node) { - if (pos1 == tz) - break; - } - list_for_each_entry(pos2, &thermal_cdev_list, node) { - if (pos2 == cdev) - break; - } - - if (tz != pos1 || cdev != pos2) - return -EINVAL; - - cdev->ops->get_max_state(cdev, &max_state); - - /* lower default 0, upper default max_state */ - lower = lower == THERMAL_NO_LIMIT ? 0 : lower; - upper = upper == THERMAL_NO_LIMIT ? max_state : upper; - - if (lower > upper || upper > max_state) - return -EINVAL; - - dev = - kzalloc(sizeof(struct thermal_instance), GFP_KERNEL); - if (!dev) - return -ENOMEM; - dev->tz = tz; - dev->cdev = cdev; - dev->trip = trip; - dev->upper = upper; - dev->lower = lower; - dev->target = THERMAL_NO_TARGET; - - result = get_idr(&tz->idr, &tz->lock, &dev->id); - if (result) - goto free_mem; - - sprintf(dev->name, "cdev%d", dev->id); - result = - sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name); - if (result) - goto release_idr; - - sprintf(dev->attr_name, "cdev%d_trip_point", dev->id); - sysfs_attr_init(&dev->attr.attr); - dev->attr.attr.name = dev->attr_name; - dev->attr.attr.mode = 0444; - dev->attr.show = thermal_cooling_device_trip_point_show; - result = device_create_file(&tz->device, &dev->attr); - if (result) - goto remove_symbol_link; - - mutex_lock(&tz->lock); - mutex_lock(&cdev->lock); - list_for_each_entry(pos, &tz->thermal_instances, tz_node) - if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { - result = -EEXIST; - break; - } - if (!result) { - list_add_tail(&dev->tz_node, &tz->thermal_instances); - list_add_tail(&dev->cdev_node, &cdev->thermal_instances); - } - mutex_unlock(&cdev->lock); - mutex_unlock(&tz->lock); - - if (!result) - return 0; - - device_remove_file(&tz->device, &dev->attr); -remove_symbol_link: - sysfs_remove_link(&tz->device.kobj, dev->name); -release_idr: - release_idr(&tz->idr, &tz->lock, dev->id); -free_mem: - kfree(dev); - return result; -} -EXPORT_SYMBOL(thermal_zone_bind_cooling_device); - -/** - * thermal_zone_unbind_cooling_device - unbind a cooling device from a thermal zone - * @tz: thermal zone device - * @trip: indicates which trip point the cooling devices is - * associated with in this thermal zone. - * @cdev: thermal cooling device - * - * This function is usually called in the thermal zone device .unbind callback. - */ -int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, - int trip, - struct thermal_cooling_device *cdev) -{ - struct thermal_instance *pos, *next; - - mutex_lock(&tz->lock); - mutex_lock(&cdev->lock); - list_for_each_entry_safe(pos, next, &tz->thermal_instances, tz_node) { - if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { - list_del(&pos->tz_node); - list_del(&pos->cdev_node); - mutex_unlock(&cdev->lock); - mutex_unlock(&tz->lock); - goto unbind; - } - } - mutex_unlock(&cdev->lock); - mutex_unlock(&tz->lock); - - return -ENODEV; - -unbind: - device_remove_file(&tz->device, &pos->attr); - sysfs_remove_link(&tz->device.kobj, pos->name); - release_idr(&tz->idr, &tz->lock, pos->id); - kfree(pos); - return 0; -} -EXPORT_SYMBOL(thermal_zone_unbind_cooling_device); - -static void thermal_release(struct device *dev) -{ - struct thermal_zone_device *tz; - struct thermal_cooling_device *cdev; - - if (!strncmp(dev_name(dev), "thermal_zone", - sizeof("thermal_zone") - 1)) { - tz = to_thermal_zone(dev); - kfree(tz); - } else { - cdev = to_cooling_device(dev); - kfree(cdev); - } -} - -static struct class thermal_class = { - .name = "thermal", - .dev_release = thermal_release, -}; - -/** - * thermal_cooling_device_register - register a new thermal cooling device - * @type: the thermal cooling device type. - * @devdata: device private data. - * @ops: standard thermal cooling devices callbacks. - */ -struct thermal_cooling_device * -thermal_cooling_device_register(char *type, void *devdata, - const struct thermal_cooling_device_ops *ops) -{ - struct thermal_cooling_device *cdev; - int result; - - if (type && strlen(type) >= THERMAL_NAME_LENGTH) - return ERR_PTR(-EINVAL); - - if (!ops || !ops->get_max_state || !ops->get_cur_state || - !ops->set_cur_state) - return ERR_PTR(-EINVAL); - - cdev = kzalloc(sizeof(struct thermal_cooling_device), GFP_KERNEL); - if (!cdev) - return ERR_PTR(-ENOMEM); - - result = get_idr(&thermal_cdev_idr, &thermal_idr_lock, &cdev->id); - if (result) { - kfree(cdev); - return ERR_PTR(result); - } - - strcpy(cdev->type, type ? : ""); - mutex_init(&cdev->lock); - INIT_LIST_HEAD(&cdev->thermal_instances); - cdev->ops = ops; - cdev->updated = true; - cdev->device.class = &thermal_class; - cdev->devdata = devdata; - dev_set_name(&cdev->device, "cooling_device%d", cdev->id); - result = device_register(&cdev->device); - if (result) { - release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); - kfree(cdev); - return ERR_PTR(result); - } - - /* sys I/F */ - if (type) { - result = device_create_file(&cdev->device, &dev_attr_cdev_type); - if (result) - goto unregister; - } - - result = device_create_file(&cdev->device, &dev_attr_max_state); - if (result) - goto unregister; - - result = device_create_file(&cdev->device, &dev_attr_cur_state); - if (result) - goto unregister; - - /* Add 'this' new cdev to the global cdev list */ - mutex_lock(&thermal_list_lock); - list_add(&cdev->node, &thermal_cdev_list); - mutex_unlock(&thermal_list_lock); - - /* Update binding information for 'this' new cdev */ - bind_cdev(cdev); - - return cdev; - -unregister: - release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); - device_unregister(&cdev->device); - return ERR_PTR(result); -} -EXPORT_SYMBOL(thermal_cooling_device_register); - -/** - * thermal_cooling_device_unregister - removes the registered thermal cooling device - * @cdev: the thermal cooling device to remove. - * - * thermal_cooling_device_unregister() must be called when the device is no - * longer needed. - */ -void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) -{ - int i; - const struct thermal_zone_params *tzp; - struct thermal_zone_device *tz; - struct thermal_cooling_device *pos = NULL; - - if (!cdev) - return; - - mutex_lock(&thermal_list_lock); - list_for_each_entry(pos, &thermal_cdev_list, node) - if (pos == cdev) - break; - if (pos != cdev) { - /* thermal cooling device not found */ - mutex_unlock(&thermal_list_lock); - return; - } - list_del(&cdev->node); - - /* Unbind all thermal zones associated with 'this' cdev */ - list_for_each_entry(tz, &thermal_tz_list, node) { - if (tz->ops->unbind) { - tz->ops->unbind(tz, cdev); - continue; - } - - if (!tz->tzp || !tz->tzp->tbp) - continue; - - tzp = tz->tzp; - for (i = 0; i < tzp->num_tbps; i++) { - if (tzp->tbp[i].cdev == cdev) { - __unbind(tz, tzp->tbp[i].trip_mask, cdev); - tzp->tbp[i].cdev = NULL; - } - } - } - - mutex_unlock(&thermal_list_lock); - - if (cdev->type[0]) - device_remove_file(&cdev->device, &dev_attr_cdev_type); - device_remove_file(&cdev->device, &dev_attr_max_state); - device_remove_file(&cdev->device, &dev_attr_cur_state); - - release_idr(&thermal_cdev_idr, &thermal_idr_lock, cdev->id); - device_unregister(&cdev->device); - return; -} -EXPORT_SYMBOL(thermal_cooling_device_unregister); - -void thermal_cdev_update(struct thermal_cooling_device *cdev) -{ - struct thermal_instance *instance; - unsigned long target = 0; - - /* cooling device is updated*/ - if (cdev->updated) - return; - - mutex_lock(&cdev->lock); - /* Make sure cdev enters the deepest cooling state */ - list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) { - if (instance->target == THERMAL_NO_TARGET) - continue; - if (instance->target > target) - target = instance->target; - } - mutex_unlock(&cdev->lock); - cdev->ops->set_cur_state(cdev, target); - cdev->updated = true; -} -EXPORT_SYMBOL(thermal_cdev_update); - -/** - * notify_thermal_framework - Sensor drivers use this API to notify framework - * @tz: thermal zone device - * @trip: indicates which trip point has been crossed - * - * This function handles the trip events from sensor drivers. It starts - * throttling the cooling devices according to the policy configured. - * For CRITICAL and HOT trip points, this notifies the respective drivers, - * and does actual throttling for other trip points i.e ACTIVE and PASSIVE. - * The throttling policy is based on the configured platform data; if no - * platform data is provided, this uses the step_wise throttling policy. - */ -void notify_thermal_framework(struct thermal_zone_device *tz, int trip) -{ - handle_thermal_trip(tz, trip); -} -EXPORT_SYMBOL(notify_thermal_framework); - -/** - * create_trip_attrs - create attributes for trip points - * @tz: the thermal zone device - * @mask: Writeable trip point bitmap. - */ -static int create_trip_attrs(struct thermal_zone_device *tz, int mask) -{ - int indx; - int size = sizeof(struct thermal_attr) * tz->trips; - - tz->trip_type_attrs = kzalloc(size, GFP_KERNEL); - if (!tz->trip_type_attrs) - return -ENOMEM; - - tz->trip_temp_attrs = kzalloc(size, GFP_KERNEL); - if (!tz->trip_temp_attrs) { - kfree(tz->trip_type_attrs); - return -ENOMEM; - } - - if (tz->ops->get_trip_hyst) { - tz->trip_hyst_attrs = kzalloc(size, GFP_KERNEL); - if (!tz->trip_hyst_attrs) { - kfree(tz->trip_type_attrs); - kfree(tz->trip_temp_attrs); - return -ENOMEM; - } - } - - - for (indx = 0; indx < tz->trips; indx++) { - /* create trip type attribute */ - snprintf(tz->trip_type_attrs[indx].name, THERMAL_NAME_LENGTH, - "trip_point_%d_type", indx); - - sysfs_attr_init(&tz->trip_type_attrs[indx].attr.attr); - tz->trip_type_attrs[indx].attr.attr.name = - tz->trip_type_attrs[indx].name; - tz->trip_type_attrs[indx].attr.attr.mode = S_IRUGO; - tz->trip_type_attrs[indx].attr.show = trip_point_type_show; - - device_create_file(&tz->device, - &tz->trip_type_attrs[indx].attr); - - /* create trip temp attribute */ - snprintf(tz->trip_temp_attrs[indx].name, THERMAL_NAME_LENGTH, - "trip_point_%d_temp", indx); - - sysfs_attr_init(&tz->trip_temp_attrs[indx].attr.attr); - tz->trip_temp_attrs[indx].attr.attr.name = - tz->trip_temp_attrs[indx].name; - tz->trip_temp_attrs[indx].attr.attr.mode = S_IRUGO; - tz->trip_temp_attrs[indx].attr.show = trip_point_temp_show; - if (mask & (1 << indx)) { - tz->trip_temp_attrs[indx].attr.attr.mode |= S_IWUSR; - tz->trip_temp_attrs[indx].attr.store = - trip_point_temp_store; - } - - device_create_file(&tz->device, - &tz->trip_temp_attrs[indx].attr); - - /* create Optional trip hyst attribute */ - if (!tz->ops->get_trip_hyst) - continue; - snprintf(tz->trip_hyst_attrs[indx].name, THERMAL_NAME_LENGTH, - "trip_point_%d_hyst", indx); - - sysfs_attr_init(&tz->trip_hyst_attrs[indx].attr.attr); - tz->trip_hyst_attrs[indx].attr.attr.name = - tz->trip_hyst_attrs[indx].name; - tz->trip_hyst_attrs[indx].attr.attr.mode = S_IRUGO; - tz->trip_hyst_attrs[indx].attr.show = trip_point_hyst_show; - if (tz->ops->set_trip_hyst) { - tz->trip_hyst_attrs[indx].attr.attr.mode |= S_IWUSR; - tz->trip_hyst_attrs[indx].attr.store = - trip_point_hyst_store; - } - - device_create_file(&tz->device, - &tz->trip_hyst_attrs[indx].attr); - } - return 0; -} - -static void remove_trip_attrs(struct thermal_zone_device *tz) -{ - int indx; - - for (indx = 0; indx < tz->trips; indx++) { - device_remove_file(&tz->device, - &tz->trip_type_attrs[indx].attr); - device_remove_file(&tz->device, - &tz->trip_temp_attrs[indx].attr); - if (tz->ops->get_trip_hyst) - device_remove_file(&tz->device, - &tz->trip_hyst_attrs[indx].attr); - } - kfree(tz->trip_type_attrs); - kfree(tz->trip_temp_attrs); - kfree(tz->trip_hyst_attrs); -} - -/** - * thermal_zone_device_register - register a new thermal zone device - * @type: the thermal zone device type - * @trips: the number of trip points the thermal zone support - * @mask: a bit string indicating the writeablility of trip points - * @devdata: private device data - * @ops: standard thermal zone device callbacks - * @tzp: thermal zone platform parameters - * @passive_delay: number of milliseconds to wait between polls when - * performing passive cooling - * @polling_delay: number of milliseconds to wait between polls when checking - * whether trip points have been crossed (0 for interrupt - * driven systems) - * - * thermal_zone_device_unregister() must be called when the device is no - * longer needed. The passive cooling depends on the .get_trend() return value. - */ -struct thermal_zone_device *thermal_zone_device_register(const char *type, - int trips, int mask, void *devdata, - const struct thermal_zone_device_ops *ops, - const struct thermal_zone_params *tzp, - int passive_delay, int polling_delay) -{ - struct thermal_zone_device *tz; - enum thermal_trip_type trip_type; - int result; - int count; - int passive = 0; - - if (type && strlen(type) >= THERMAL_NAME_LENGTH) - return ERR_PTR(-EINVAL); - - if (trips > THERMAL_MAX_TRIPS || trips < 0 || mask >> trips) - return ERR_PTR(-EINVAL); - - if (!ops || !ops->get_temp) - return ERR_PTR(-EINVAL); - - if (trips > 0 && !ops->get_trip_type) - return ERR_PTR(-EINVAL); - - tz = kzalloc(sizeof(struct thermal_zone_device), GFP_KERNEL); - if (!tz) - return ERR_PTR(-ENOMEM); - - INIT_LIST_HEAD(&tz->thermal_instances); - idr_init(&tz->idr); - mutex_init(&tz->lock); - result = get_idr(&thermal_tz_idr, &thermal_idr_lock, &tz->id); - if (result) { - kfree(tz); - return ERR_PTR(result); - } - - strcpy(tz->type, type ? : ""); - tz->ops = ops; - tz->tzp = tzp; - tz->device.class = &thermal_class; - tz->devdata = devdata; - tz->trips = trips; - tz->passive_delay = passive_delay; - tz->polling_delay = polling_delay; - - dev_set_name(&tz->device, "thermal_zone%d", tz->id); - result = device_register(&tz->device); - if (result) { - release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); - kfree(tz); - return ERR_PTR(result); - } - - /* sys I/F */ - if (type) { - result = device_create_file(&tz->device, &dev_attr_type); - if (result) - goto unregister; - } - - result = device_create_file(&tz->device, &dev_attr_temp); - if (result) - goto unregister; - - if (ops->get_mode) { - result = device_create_file(&tz->device, &dev_attr_mode); - if (result) - goto unregister; - } - - result = create_trip_attrs(tz, mask); - if (result) - goto unregister; - - for (count = 0; count < trips; count++) { - tz->ops->get_trip_type(tz, count, &trip_type); - if (trip_type == THERMAL_TRIP_PASSIVE) - passive = 1; - } - - if (!passive) { - result = device_create_file(&tz->device, &dev_attr_passive); - if (result) - goto unregister; - } - -#ifdef CONFIG_THERMAL_EMULATION - result = device_create_file(&tz->device, &dev_attr_emul_temp); - if (result) - goto unregister; -#endif - /* Create policy attribute */ - result = device_create_file(&tz->device, &dev_attr_policy); - if (result) - goto unregister; - - /* Update 'this' zone's governor information */ - mutex_lock(&thermal_governor_lock); - - if (tz->tzp) - tz->governor = __find_governor(tz->tzp->governor_name); - else - tz->governor = __find_governor(DEFAULT_THERMAL_GOVERNOR); - - mutex_unlock(&thermal_governor_lock); - - result = thermal_add_hwmon_sysfs(tz); - if (result) - goto unregister; - - mutex_lock(&thermal_list_lock); - list_add_tail(&tz->node, &thermal_tz_list); - mutex_unlock(&thermal_list_lock); - - /* Bind cooling devices for this zone */ - bind_tz(tz); - - INIT_DELAYED_WORK(&(tz->poll_queue), thermal_zone_device_check); - - thermal_zone_device_update(tz); - - if (!result) - return tz; - -unregister: - release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); - device_unregister(&tz->device); - return ERR_PTR(result); -} -EXPORT_SYMBOL(thermal_zone_device_register); - -/** - * thermal_device_unregister - removes the registered thermal zone device - * @tz: the thermal zone device to remove - */ -void thermal_zone_device_unregister(struct thermal_zone_device *tz) -{ - int i; - const struct thermal_zone_params *tzp; - struct thermal_cooling_device *cdev; - struct thermal_zone_device *pos = NULL; - - if (!tz) - return; - - tzp = tz->tzp; - - mutex_lock(&thermal_list_lock); - list_for_each_entry(pos, &thermal_tz_list, node) - if (pos == tz) - break; - if (pos != tz) { - /* thermal zone device not found */ - mutex_unlock(&thermal_list_lock); - return; - } - list_del(&tz->node); - - /* Unbind all cdevs associated with 'this' thermal zone */ - list_for_each_entry(cdev, &thermal_cdev_list, node) { - if (tz->ops->unbind) { - tz->ops->unbind(tz, cdev); - continue; - } - - if (!tzp || !tzp->tbp) - break; - - for (i = 0; i < tzp->num_tbps; i++) { - if (tzp->tbp[i].cdev == cdev) { - __unbind(tz, tzp->tbp[i].trip_mask, cdev); - tzp->tbp[i].cdev = NULL; - } - } - } - - mutex_unlock(&thermal_list_lock); - - thermal_zone_device_set_polling(tz, 0); - - if (tz->type[0]) - device_remove_file(&tz->device, &dev_attr_type); - device_remove_file(&tz->device, &dev_attr_temp); - if (tz->ops->get_mode) - device_remove_file(&tz->device, &dev_attr_mode); - device_remove_file(&tz->device, &dev_attr_policy); - remove_trip_attrs(tz); - tz->governor = NULL; - - thermal_remove_hwmon_sysfs(tz); - release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id); - idr_destroy(&tz->idr); - mutex_destroy(&tz->lock); - device_unregister(&tz->device); - return; -} -EXPORT_SYMBOL(thermal_zone_device_unregister); - -#ifdef CONFIG_NET -static struct genl_family thermal_event_genl_family = { - .id = GENL_ID_GENERATE, - .name = THERMAL_GENL_FAMILY_NAME, - .version = THERMAL_GENL_VERSION, - .maxattr = THERMAL_GENL_ATTR_MAX, -}; - -static struct genl_multicast_group thermal_event_mcgrp = { - .name = THERMAL_GENL_MCAST_GROUP_NAME, -}; - -int thermal_generate_netlink_event(struct thermal_zone_device *tz, - enum events event) -{ - struct sk_buff *skb; - struct nlattr *attr; - struct thermal_genl_event *thermal_event; - void *msg_header; - int size; - int result; - static unsigned int thermal_event_seqnum; - - if (!tz) - return -EINVAL; - - /* allocate memory */ - size = nla_total_size(sizeof(struct thermal_genl_event)) + - nla_total_size(0); - - skb = genlmsg_new(size, GFP_ATOMIC); - if (!skb) - return -ENOMEM; - - /* add the genetlink message header */ - msg_header = genlmsg_put(skb, 0, thermal_event_seqnum++, - &thermal_event_genl_family, 0, - THERMAL_GENL_CMD_EVENT); - if (!msg_header) { - nlmsg_free(skb); - return -ENOMEM; - } - - /* fill the data */ - attr = nla_reserve(skb, THERMAL_GENL_ATTR_EVENT, - sizeof(struct thermal_genl_event)); - - if (!attr) { - nlmsg_free(skb); - return -EINVAL; - } - - thermal_event = nla_data(attr); - if (!thermal_event) { - nlmsg_free(skb); - return -EINVAL; - } - - memset(thermal_event, 0, sizeof(struct thermal_genl_event)); - - thermal_event->orig = tz->id; - thermal_event->event = event; - - /* send multicast genetlink message */ - result = genlmsg_end(skb, msg_header); - if (result < 0) { - nlmsg_free(skb); - return result; - } - - result = genlmsg_multicast(skb, 0, thermal_event_mcgrp.id, GFP_ATOMIC); - if (result) - dev_err(&tz->device, "Failed to send netlink event:%d", result); - - return result; -} -EXPORT_SYMBOL(thermal_generate_netlink_event); - -static int genetlink_init(void) -{ - int result; - - result = genl_register_family(&thermal_event_genl_family); - if (result) - return result; - - result = genl_register_mc_group(&thermal_event_genl_family, - &thermal_event_mcgrp); - if (result) - genl_unregister_family(&thermal_event_genl_family); - return result; -} - -static void genetlink_exit(void) -{ - genl_unregister_family(&thermal_event_genl_family); -} -#else /* !CONFIG_NET */ -static inline int genetlink_init(void) { return 0; } -static inline void genetlink_exit(void) {} -#endif /* !CONFIG_NET */ - -static int __init thermal_init(void) -{ - int result = 0; - - result = class_register(&thermal_class); - if (result) { - idr_destroy(&thermal_tz_idr); - idr_destroy(&thermal_cdev_idr); - mutex_destroy(&thermal_idr_lock); - mutex_destroy(&thermal_list_lock); - return result; - } - result = genetlink_init(); - return result; -} - -static void __exit thermal_exit(void) -{ - class_unregister(&thermal_class); - idr_destroy(&thermal_tz_idr); - idr_destroy(&thermal_cdev_idr); - mutex_destroy(&thermal_idr_lock); - mutex_destroy(&thermal_list_lock); - genetlink_exit(); -} - -fs_initcall(thermal_init); -module_exit(thermal_exit); diff --git a/drivers/thermal/user_space.c b/drivers/thermal/user_space.c index 6bbb380b6d19..10adcddc8821 100644 --- a/drivers/thermal/user_space.c +++ b/drivers/thermal/user_space.c @@ -22,9 +22,6 @@ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include #include #include "thermal_core.h" @@ -46,23 +43,15 @@ static int notify_user_space(struct thermal_zone_device *tz, int trip) static struct thermal_governor thermal_gov_user_space = { .name = "user_space", .throttle = notify_user_space, - .owner = THIS_MODULE, }; -static int __init thermal_gov_user_space_init(void) +int thermal_gov_user_space_register(void) { return thermal_register_governor(&thermal_gov_user_space); } -static void __exit thermal_gov_user_space_exit(void) +void thermal_gov_user_space_unregister(void) { thermal_unregister_governor(&thermal_gov_user_space); } -/* This should load after thermal framework */ -fs_initcall(thermal_gov_user_space_init); -module_exit(thermal_gov_user_space_exit); - -MODULE_AUTHOR("Durgadoss R"); -MODULE_DESCRIPTION("A user space Thermal notifier"); -MODULE_LICENSE("GPL"); diff --git a/drivers/tty/serial/68328serial.c b/drivers/tty/serial/68328serial.c index ef2e08e9b590..5dc9c4bfa66e 100644 --- a/drivers/tty/serial/68328serial.c +++ b/drivers/tty/serial/68328serial.c @@ -14,7 +14,6 @@ * 2.4/2.5 port David McCullough */ -#include #include #include #include diff --git a/drivers/tty/serial/bcm63xx_uart.c b/drivers/tty/serial/bcm63xx_uart.c index 52a3ecd40421..6fa2ae77fffd 100644 --- a/drivers/tty/serial/bcm63xx_uart.c +++ b/drivers/tty/serial/bcm63xx_uart.c @@ -30,7 +30,6 @@ #include #include -#include #include #include #include diff --git a/drivers/usb/host/ehci-tilegx.c b/drivers/usb/host/ehci-tilegx.c index 1d215cdb9dea..b083a350eea3 100644 --- a/drivers/usb/host/ehci-tilegx.c +++ b/drivers/usb/host/ehci-tilegx.c @@ -118,8 +118,10 @@ static int ehci_hcd_tilegx_drv_probe(struct platform_device *pdev) hcd = usb_create_hcd(&ehci_tilegx_hc_driver, &pdev->dev, dev_name(&pdev->dev)); - if (!hcd) - return -ENOMEM; + if (!hcd) { + ret = -ENOMEM; + goto err_hcd; + } /* * We don't use rsrc_start to map in our registers, but seems like @@ -176,6 +178,7 @@ err_have_irq: err_no_irq: tilegx_stop_ehc(); usb_put_hcd(hcd); +err_hcd: gxio_usb_host_destroy(&pdata->usb_ctx); return ret; } diff --git a/drivers/usb/host/ohci-tilegx.c b/drivers/usb/host/ohci-tilegx.c index 1ae7b28a71c2..ea73009de623 100644 --- a/drivers/usb/host/ohci-tilegx.c +++ b/drivers/usb/host/ohci-tilegx.c @@ -112,8 +112,10 @@ static int ohci_hcd_tilegx_drv_probe(struct platform_device *pdev) hcd = usb_create_hcd(&ohci_tilegx_hc_driver, &pdev->dev, dev_name(&pdev->dev)); - if (!hcd) - return -ENOMEM; + if (!hcd) { + ret = -ENOMEM; + goto err_hcd; + } /* * We don't use rsrc_start to map in our registers, but seems like @@ -165,6 +167,7 @@ err_have_irq: err_no_irq: tilegx_stop_ohc(); usb_put_hcd(hcd); +err_hcd: gxio_usb_host_destroy(&pdata->usb_ctx); return ret; } diff --git a/drivers/usb/phy/Kconfig b/drivers/usb/phy/Kconfig index aab2ab2fbc90..371d0e74e909 100644 --- a/drivers/usb/phy/Kconfig +++ b/drivers/usb/phy/Kconfig @@ -128,7 +128,7 @@ config TWL6030_USB config USB_GPIO_VBUS tristate "GPIO based peripheral-only VBUS sensing 'transceiver'" - depends on GENERIC_GPIO + depends on GPIOLIB help Provides simple GPIO VBUS sensing for controllers with an internal transceiver via the usb_phy interface, and diff --git a/drivers/video/Kconfig b/drivers/video/Kconfig index c04ccdf60eaa..d71d60f94fc1 100644 --- a/drivers/video/Kconfig +++ b/drivers/video/Kconfig @@ -2429,7 +2429,7 @@ config FB_MXS select FB_CFB_COPYAREA select FB_CFB_IMAGEBLIT select FB_MODE_HELPERS - select OF_VIDEOMODE + select VIDEOMODE_HELPERS help Framebuffer support for the MXS SoC. @@ -2483,7 +2483,7 @@ config FB_SSD1307 tristate "Solomon SSD1307 framebuffer support" depends on FB && I2C depends on OF - depends on GENERIC_GPIO + depends on GPIOLIB select FB_SYS_FOPS select FB_SYS_FILLRECT select FB_SYS_COPYAREA diff --git a/drivers/video/au1100fb.c b/drivers/video/au1100fb.c index ddabaa867b0d..700cac067b46 100644 --- a/drivers/video/au1100fb.c +++ b/drivers/video/au1100fb.c @@ -111,30 +111,16 @@ static int au1100fb_fb_blank(int blank_mode, struct fb_info *fbi) switch (blank_mode) { case VESA_NO_BLANKING: - /* Turn on panel */ - fbdev->regs->lcd_control |= LCD_CONTROL_GO; -#ifdef CONFIG_MIPS_PB1100 - if (fbdev->panel_idx == 1) { - au_writew(au_readw(PB1100_G_CONTROL) - | (PB1100_G_CONTROL_BL | PB1100_G_CONTROL_VDD), - PB1100_G_CONTROL); - } -#endif + /* Turn on panel */ + fbdev->regs->lcd_control |= LCD_CONTROL_GO; au_sync(); break; case VESA_VSYNC_SUSPEND: case VESA_HSYNC_SUSPEND: case VESA_POWERDOWN: - /* Turn off panel */ - fbdev->regs->lcd_control &= ~LCD_CONTROL_GO; -#ifdef CONFIG_MIPS_PB1100 - if (fbdev->panel_idx == 1) { - au_writew(au_readw(PB1100_G_CONTROL) - & ~(PB1100_G_CONTROL_BL | PB1100_G_CONTROL_VDD), - PB1100_G_CONTROL); - } -#endif + /* Turn off panel */ + fbdev->regs->lcd_control &= ~LCD_CONTROL_GO; au_sync(); break; default: diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig index 2e166c3fc4c3..d5ab6583f440 100644 --- a/drivers/video/backlight/Kconfig +++ b/drivers/video/backlight/Kconfig @@ -36,14 +36,14 @@ config LCD_CORGI config LCD_L4F00242T03 tristate "Epson L4F00242T03 LCD" - depends on SPI_MASTER && GENERIC_GPIO + depends on SPI_MASTER && GPIOLIB help SPI driver for Epson L4F00242T03. This provides basic support for init and powering the LCD up/down through a sysfs interface. config LCD_LMS283GF05 tristate "Samsung LMS283GF05 LCD" - depends on SPI_MASTER && GENERIC_GPIO + depends on SPI_MASTER && GPIOLIB help SPI driver for Samsung LMS283GF05. This provides basic support for powering the LCD up/down through a sysfs interface. diff --git a/drivers/video/mxsfb.c b/drivers/video/mxsfb.c index 1b2c26d1658c..21223d475b39 100644 --- a/drivers/video/mxsfb.c +++ b/drivers/video/mxsfb.c @@ -42,7 +42,6 @@ #include #include #include -#include